Tet transactivator system

ABSTRACT

A transcriptional activator of T. gondii is provided which comprises the tetracycline repressor (TetR) operatively linked to a transacting factor of T. gondii. Strains of T. gondii transformed with a vector containing such a transactivator may be used to prepare vaccine compositions or to identify essential genes in the parasite. The system provided may be useful in other Apicomplexan species such as Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii, Plasmodium knowlesi, Trypanosoma brucei, Entamoeba histolytica, and Giardia lamblia.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] Not applicable.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

[0002] Not applicable.

BACKGROUND OF INVENTION

[0003] 1. Field of the Invention

[0004] The present invention relates to nucleic acid constructs that can act as inducible transactivator systems in Apicomplexan parasites. Such constructs can be used to create attenuated strains of the parasites that can act as vaccines to protect against infection by wild-type parasites. The transactivator system also permits the systematic study of the genes in Apicomplexan parasites.

[0005] 2. Background Art

[0006] Inducible control of individual gene expression is a prerequisite for studying the function of essential genes. Several strategies and tools associated with DNA transformation have been developed for the human and animal pathogens Toxoplasma gondii and Plasmodium species. The currently available methods to study essential gene function are antisense RNA and ribozyme technology (Nakaar et al J. Biol. Chem. 274 5083-5087 (1999); Gardiner et al Mol. Biochem. Parasitol 110 33-41). The development of a controlled gene expression system would not only permit the generation of conditional knockouts but would also allow the study of mutated forms of endogenous genes and the expression of toxic genes.

[0007] In the originally described tetracycline-controlled inducible expression system (Gossen, M. & Bujard, H., Proc. Nat'l Acad. Sci. USA 89 5547-5551 (1992)), the fusion of the tetracycline repressor (TetR) with the activating domain of the Herpes simplex virion protein 16 (VP16) converted the repressor into an efficient tetracycline-controlled transactivator (tTA). In that case, a minimal promoter fused to tetracycline operator (tetO) sequences is activated in cells expressing tTA and becomes silent in the presence of tetracycline. This system is highly efficient in regulating genes in diverse eukaryotic organisms but has not been established in any protozoan parasite.

[0008] In contrast, the TetR system regulates gene expression in a number of protozoan parasites, including Trypanosoma brucei (Wirtz, E. & Clayton, C., Science 268 1179-1183 (1995)), Entamoeba histolytica (Hamann et al Mol. Biochem. Parasitol. 84 83-91 (1997)), Giardia lamblia (Sun, C. H. & Tai, J. H., Mol. Biochem. Parasitol. 105 51-60 (2000)), and Toxoplasma gondii (Meissner et al Nucleic Acids Res. 29 e115 (2001)). As in bacteria, TetR interferes with initiation of transcription by binding to tetO sequences placed appropriately in the vicinity of the promoter region of protozoan genes. In the presence of tetracycline the repressor ceases to bind to the tetO sequence and thus interference is abolished, rendering the promoter active.
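The two regulatory logics described above, transactivation by tTA and repression by TetR, respond to tetracycline in opposite directions. The following is a minimal sketch of that contrast as Boolean functions. The function and parameter names are illustrative assumptions, and the model ignores dosage, leakiness and kinetics.

```python
def tta_target_active(tta_present: bool, tetracycline: bool) -> bool:
    """Transactivator logic (illustrative): a minimal promoter fused to tetO
    is active only when tTA is expressed and no tetracycline is present."""
    return tta_present and not tetracycline


def tetr_target_active(tetr_present: bool, tetracycline: bool) -> bool:
    """Repressor logic (illustrative): a promoter carrying tetO is active
    unless TetR is bound, and TetR ceases to bind when tetracycline is present."""
    return (not tetr_present) or tetracycline


if __name__ == "__main__":
    for drug in (False, True):
        print(
            f"tetracycline={drug}: "
            f"tTA target active={tta_target_active(True, drug)}, "
            f"TetR target active={tetr_target_active(True, drug)}"
        )
```

Under these assumptions, adding tetracycline switches a tTA-driven gene off and a TetR-repressed gene on.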

[0009] T. gondii exhibits a remarkably high frequency of stable transformation coupled with integration that occurs preferentially at random throughout the genome. These properties have previously been exploited to design insertional mutagenesis strategies leading to the cloning of non-essential genes and to the identification of developmentally regulated genes by promoter trapping (Donald et al J. Biol. Chem. 271 14010-14019 (1996); Knoll, L. J., & Boothroyd, J. C., Mol. Cell. Biol. 18 807-814 (1998)).

[0010] An inducible system based on the tet-Repressor has been reported to control gene expression in several protozoan parasites but has been best optimised and applied predominantly in Trypanosoma brucei. Indeed, the existence of trans-splicing in kinetoplastida offers a unique opportunity to combine this tetracycline-dependent repression with the powerful T7 polymerase transcription (Wirtz et al Mol. Biochem. Parasitol. 99 89-101 (1999)). In contrast, the broadly used and tighter transactivator system (tTA), composed of a tetR-VP16 fusion, has not been reported to function in any protozoan parasite. Recent studies investigating the use of the tet-Repressor to control gene expression in T. gondii found that the tTA was totally inactive (Meissner et al Nucleic Acids Res. 29 e115 (2001)). While the repression system is suitable for the expression of toxic genes and dominant negative mutants, the necessity to treat the parasites continuously with anhydrotetracycline (ATc) during the procedures of selection and cloning renders it inappropriate for the generation of conditional knockouts.

[0011] There exists a need therefore to overcome this considerable limitation to the further study of Apicomplexan parasites to enable the application of techniques based on inducible transactivator technology.

SUMMARY OF INVENTION

[0012] According to a first aspect of the present invention, there is provided a nucleic acid construct comprising the tetracycline repressor (TetR) operatively linked to a transacting factor of T. gondii.

[0013] According to a second aspect of the invention, there is provided a host cell transformed with a nucleic acid construct according to the first aspect. The host cell can be a bacterium, for example Escherichia coli, a yeast cell, for example Saccharomyces cerevisiae, or Schizosaccharomyces pombe, or a protozoan, for example Toxoplasma gondii or Plasmodium species, for example Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii or Plasmodium knowlesi, or Trypanosoma brucei, or Entamoeba histolytica, or Giardia lamblia.

[0014] According to a third aspect of the invention, there is provided a nucleic acid construct according to the first aspect of the invention for use in medicine. This aspect of the invention therefore extends to a method of treatment for or prevention of an infection caused by a protozoan, for example Toxoplasma gondii or Plasmodium species, for example Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii or Plasmodium knowlesi, or Trypanosoma brucei, or Entamoeba histolytica, or Giardia lamblia, comprising administration of a nucleic acid construct according to the first aspect of the invention.

[0015] According to a fourth aspect of the invention, there is provided a vaccine composition comprising a protozoan, for example Toxoplasma gondii or Plasmodium species, for example Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii or Plasmodium knowlesi, or Trypanosoma brucei, or Entamoeba histolytica, or Giardia lamblia, transfected with a nucleic acid construct according to the first aspect of the invention.

[0016] According to a fifth aspect of the invention, there is provided the use of a nucleic acid construct according to the first aspect of the invention in the preparation of a vaccine for use in the treatment or prophylaxis of an infection caused by a protozoan, for example Toxoplasma gondii or Plasmodium species, for example Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii or Plasmodium knowlesi, or Trypanosoma brucei, or Entamoeba histolytica, or Giardia lamblia.

[0017] According to a sixth aspect of the invention, there is provided a process for the preparation of a nucleic acid construct according to the first aspect of the invention, the process comprising ligating together nucleic acid sequences encoding a tetracycline-controlled transactivator and a transacting factor of T. gondii, optionally including linker or additional sequences.

[0018] According to a seventh aspect of the invention, there is provided a process for the preparation of a host cell according to the second aspect of the invention, the process comprising transfecting a cell with a nucleic acid construct according to the first aspect of the invention.

[0019] According to an eighth aspect of the invention, there is provided a process for the preparation of a vaccine composition according to the fourth aspect of the invention, the process comprising transfecting a host cell with a nucleic acid construct according to the first aspect of the invention.

[0020] Other aspects and advantages of the invention will be apparent from the following description and the appended claims.

BRIEF DESCRIPTION OF DRAWINGS

[0021] FIGS. 1(a)-1(d) show trapping of a functional transactivator in T. gondii. FIG. 1(a) shows the scheme of the transactivator-TRAP strategy. FIG. 1(b) shows that the clone TATi-1 regulated LacZ expression in an ATc-dependent manner. FIG. 1(c) shows the amino acid sequence of the transactivating domain of TATi-1 (lower sequence line, starting “-PTF”) fused to tetR (upper sequence line, starting “..HQ”). FIG. 1(d) shows transient transfection of p7TetOS1LacZ-CAT into RH parasites or into strains expressing TATi or tTA^(2s).

[0022] FIGS. 2(a)-2(e) show the generation of a conditional knockout for the TgMyoA gene. FIG. 2(a) shows modulation of mycMyoA transgene expression in TATi-1 parasites. FIG. 2(b) shows Western blot analysis of mycMyoA expression. FIG. 2(c) shows detection of endogenous MyoA and mycMyoA genes by analytical PCR on genomic DNA from RH, TATi-1 transformed with T7S4mycMyoA, myoako1 and myoako2. FIG. 2(d) shows analysis of myoako1 by Western blot with anti-MyoA antibodies, which reveals that this clone lacked endogenous TgMyoA. FIG. 2(e) shows inducibility of mycMyoA expression in clones myoako1 and myoako2, which have regulatable mycMyoA.

[0023] FIGS. 3(a)-3(d) show phenotypic consequences of TgMyoA depletion for parasite propagation in culture. FIG. 3(a) shows a plaque assay for myoako1 and RH-wt. FIG. 3(b) shows an invasion assay of myoako1 in comparison to RH-wt parasites. FIG. 3(c) shows an egress assay of myoako. FIG. 3(d) shows quantification of egress as a function of incubation time after addition of Ca-ionophore A23187.

[0024] FIGS. 4(a) and 4(b) show that parasites depleted in TgMyoA are avirulent in mice and confer protection against a new challenge with wild-type parasites. FIG. 4(a) shows tachyzoites from RH or myoako1 mutants that were injected i.p. into BALB/c mice and monitored for more than 30 days. FIG. 4(b) shows the development/induction of T. gondii-specific T cells after infection with myoako1 mutants.

[0025] FIGS. 5(a)-5(j) show the plasmid maps (with the respective nucleotide sequences shown in the sequence listing) for the vectors referred to in the Examples and FIGS. 1 to 4. FIGS. 5(a) to (d) show the vectors used to conditionally express MyoA or GFP: FIG. 5(a) shows pTetO7Sag4-MyoA, FIG. 5(b) shows pTetO7Sag1-MyoA, FIG. 5(c) shows pTetO7Sag4-GFP, and FIG. 5(d) shows pTetO7Sag1-GFP. FIGS. 5(e) to (g) show the vectors used to generate the recipient strain for the transactivator screening by random insertion: FIG. 5(e) shows pTetO7Sag1-HXGPRT, FIG. 5(f) shows pTetO7Sag1-LacZ, and FIG. 5(g) shows pTetO7Sag4-LacZ. FIG. 5(h) shows the vector used for the random integration pTub8TetRsynthetic: Ptub8TetR-GCN5-DHFRTS. FIGS. 5(i) and 5(j) show the vectors used to express a functional transactivator: FIG. 5(i) shows pTub8TATi-1-HXGPRT, and FIG. 5(j) shows pTub8TATi-3-HXGPRT.

[0026] FIGS. 6(a) and 6(b) show the nucleic acid sequences and presumed amino acid sequences of TATi-1 and TATi-3, respectively.

DETAILED DESCRIPTION

[0027] The TetR is described in Gossen, M. & Bujard, H. (1992) (Proc. Nat'l Acad. Sci USA 89 5547-5551) and sequence elements are shown in the constructs of FIG. 5. Transacting factors of T. gondii can be TATi-1 or TATi-3, in which the factor comprises a fusion protein of TetR and a T. gondii activating domain having a sequence as shown in FIG. 6. Alternatively, additional transacting factors can be identified using the methodology described in the present application. For example, using a library of degenerate oligonucleotides fused to the TetR could lead to the identification of artificial transcriptional activating domains.

[0028] The transacting factor may be TATi-1, TATi-3, or an analog, homolog, ortholog, related polypeptide, derivative, fragment or isoform thereof. The fusion protein formed between TetR and the activating domain may be a contiguous fusion of the two peptide sequences, or one or more additional linker amino acids may be inserted between the protein domains. Alternatively, one or more C-terminal residues from TetR may be truncated, as may one or more N-terminal residues from the T. gondii activating domain.

[0029] The term “analog” as used herein refers to a polypeptide that possesses a similar or identical function as a transacting factor of T. gondii (TATi) but need not necessarily comprise an amino acid sequence that is similar or identical to the amino acid sequence of the TATi, or possess a structure that is similar or identical to that of the TATi. As used herein, an amino acid sequence of a polypeptide is “similar” to that of a TATi if it satisfies at least one of the following criteria: (a) the polypeptide has an amino acid sequence that is at least 30% (more preferably, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or at least 99%) identical to the amino acid sequence of the TATi; (b) the polypeptide is encoded by a nucleotide sequence that hybridizes under stringent conditions to a nucleotide sequence encoding at least 5 amino acid residues (more preferably, at least 10 amino acid residues, at least 15 amino acid residues, at least 20 amino acid residues, at least 25 amino acid residues, at least 40 amino acid residues, at least 50 amino acid residues, at least 60 amino acid residues, at least 70 amino acid residues, at least 80 amino acid residues, at least 90 amino acid residues, at least 100 amino acid residues, at least 125 amino acid residues, or at least 150 amino acid residues) of the TATi; or (c) the polypeptide is encoded by a nucleotide sequence that is at least 30% (more preferably, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or at least 99%) identical to the nucleotide sequence encoding the TATi. As used herein, a polypeptide with “similar structure” to that of a TATi refers to a polypeptide that has a similar secondary, tertiary or quaternary structure as that of the TATi. The structure of a polypeptide can be determined by methods known to those skilled in the art, including but not limited to, X-ray crystallography, nuclear magnetic resonance, and crystallographic electron microscopy.

[0030] Similarity of TATi polypeptides can also be determined functionally by transfecting a suitable host cell with a nucleic acid construct containing DNA encoding the polypeptide and monitoring for transactivating activity as herein described.

[0031] The term “TATi fusion protein” as used herein refers to a polypeptide that comprises (i) an amino acid sequence of a TATi, a TATi fragment, a TATi-related polypeptide or a fragment of a TATi-related polypeptide and (ii) an amino acid sequence of a heterologous polypeptide (i.e., a non-TATi, non-TATi fragment or non-TATi-related polypeptide), which will generally be TetR.

[0032] The term “TATi homolog” as used herein refers to a polypeptide that comprises an amino acid sequence similar to that of a TATi but does not necessarily possess a similar or identical function as the TATi.

[0033] The term “TATi ortholog” as used herein refers to a non-T. gondii polypeptide that (i) comprises an amino acid sequence similar to that of a TATi and (ii) possesses a similar or identical function to that of the TATi.

[0034] The term “TATi-related polypeptide” as used herein refers to a TATi homolog, a TATi analog, an isoform of TATi, a TATi ortholog, or any combination thereof.

[0035] The term “derivative” as used herein refers to a polypeptide that comprises an amino acid sequence of a second polypeptide which has been altered by the introduction of amino acid residue substitutions, deletions or additions. The derivative polypeptide possesses a similar or identical function as the second polypeptide.

[0036] The term “fragment” as used herein refers to a peptide or polypeptide comprising an amino acid sequence of at least 5 amino acid residues (preferably, at least 10 amino acid residues, at least 15 amino acid residues, at least 20 amino acid residues, at least 25 amino acid residues, at least 40 amino acid residues, at least 50 amino acid residues, at least 60 amino acid residues, at least 70 amino acid residues, at least 80 amino acid residues, at least 90 amino acid residues, at least 100 amino acid residues, at least 125 amino acid residues, at least 150 amino acid residues, at least 175 amino acid residues, at least 200 amino acid residues, or at least 250 amino acid residues) of the amino acid sequence of a second polypeptide. The fragment of a TATi may or may not possess a functional activity of the second polypeptide.

[0037] The term “isoform” as used herein refers to variants of a polypeptide that are encoded by the same gene, but that differ in their pI or MW, or both. Such isoforms can differ in their amino acid composition (e.g. as a result of alternative splicing or limited proteolysis) and in addition, or in the alternative, may arise from differential post-translational modification (e.g., glycosylation, acylation, phosphorylation). As used herein, the term “isoform” also refers to a protein that exists in only a single form, i.e., it is not expressed as several variants.

[0038] The term “modulate” when used herein in reference to expression or activity of a TATi or a TATi-related polypeptide refers to the upregulation or downregulation of the expression or activity of the TATi or a TATi-related polypeptide. Based on the present disclosure, such modulation can be determined by assays known to those of skill in the art or described herein.

[0039] The percent identity of two amino acid sequences or of two nucleic acid sequences is determined by aligning the sequences for optimal comparison purposes (e.g., gaps can be introduced in the first sequence for best alignment with the second sequence) and comparing the amino acid residues or nucleotides at corresponding positions. The “best alignment” is an alignment of two sequences which results in the highest percent identity. The percent identity is determined by the number of identical amino acid residues or nucleotides in the sequences being compared (i.e., % identity = (# of identical positions / total # of positions) × 100).
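The percent identity formula above can be illustrated with a short sketch. It assumes that the two sequences have already been aligned to equal length with “-” marking gaps, and that the “total # of positions” is the length of the alignment including gap columns. Both points are assumptions made for illustration, as the text does not fix them, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: '-' denotes a gap, and gap columns count towards the total
    number of positions but never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return identical / len(aligned_a) * 100


if __name__ == "__main__":
    # Five identical columns out of six aligned positions.
    print(round(percent_identity("MKT-LV", "MKTALV"), 2))
```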

[0040] The determination of percent identity between two sequences can be accomplished using a mathematical algorithm known to those of skill in the art. An example of a mathematical algorithm for comparing two sequences is the algorithm of Karlin and Altschul Proc. Natl. Acad. Sci. USA (1990) 87:2264-2268, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877. The NBLAST and XBLAST programs of Altschul et al, J. Mol. Biol. (1990) 215:403-410 have incorporated such an algorithm. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al, Nucleic Acids Res. (1997) 25:3389-3402. Alternatively, PSI-Blast can be used to perform an iterated search which detects distant relationships between molecules (Id.). When utilizing BLAST, Gapped BLAST, and PSI-Blast programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov.

[0041] Another example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, CABIOS (1989). The ALIGN program (version 2.0), which is part of the GCG sequence alignment software package, has incorporated such an algorithm. Other algorithms for sequence analysis known in the art include ADVANCE and ADAM as described in Torelli and Robotti Comput. Appl. Biosci. (1994) 10:3-5; and FASTA described in Pearson and Lipman Proc. Natl. Acad. Sci. USA (1988) 85:2444-8. Within FASTA, ktup is a control option that sets the sensitivity and speed of the search.

[0042] In the present invention, the transacting factor of T. gondii may be TATi, although it is envisaged that alternative synthetic forms of the polypeptide could be made by substitution of one or more amino acids in the molecule. The invention therefore extends to the use of a molecule having TATi activity. The skilled person is aware that various amino acids have similar properties. One or more such amino acids of a substance can often be substituted by one or more other such amino acids without eliminating a desired activity of that substance. Thus the amino acids glycine, alanine, valine, leucine and isoleucine can often be substituted for one another (amino acids having aliphatic side chains). Of these possible substitutions it is preferred that glycine and alanine are used to substitute for one another (since they have relatively short side chains) and that valine, leucine and isoleucine are used to substitute for one another (since they have larger aliphatic side chains which are hydrophobic). Other amino acids which can often be substituted for one another include: phenylalanine, tyrosine and tryptophan (amino acids having aromatic side chains); lysine, arginine and histidine (amino acids having basic side chains); aspartate and glutamate (amino acids having acidic side chains); asparagine and glutamine (amino acids having amide side chains); and cysteine and methionine (amino acids having sulphur containing side chains). Substitutions of this nature are often referred to as “conservative” or “semi-conservative” amino acid substitutions.
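The substitution groups listed above can be expressed as a simple lookup. The sketch below uses one-letter amino acid codes and hypothetical names (`CONSERVATIVE_GROUPS`, `substitution_class`, `is_preferred_aliphatic_swap`). It encodes only the groupings stated in the text and makes no prediction about whether a given substitution preserves activity.

```python
from typing import Optional

# Groups as listed in the text, written with one-letter amino acid codes.
CONSERVATIVE_GROUPS = {
    "aliphatic": set("GAVLI"),   # glycine, alanine, valine, leucine, isoleucine
    "aromatic": set("FYW"),      # phenylalanine, tyrosine, tryptophan
    "basic": set("KRH"),         # lysine, arginine, histidine
    "acidic": set("DE"),         # aspartate, glutamate
    "amide": set("NQ"),          # asparagine, glutamine
    "sulphur": set("CM"),        # cysteine, methionine
}

# Preferred aliphatic pairings: short side chains versus larger hydrophobic ones.
PREFERRED_ALIPHATIC = (set("GA"), set("VLI"))


def substitution_class(original: str, replacement: str) -> Optional[str]:
    """Return the name of the group shared by both residues, or None."""
    for name, members in CONSERVATIVE_GROUPS.items():
        if original in members and replacement in members:
            return name
    return None


def is_preferred_aliphatic_swap(original: str, replacement: str) -> bool:
    """True if both residues fall within the same preferred aliphatic subgroup."""
    return any(
        original in sub and replacement in sub for sub in PREFERRED_ALIPHATIC
    )


if __name__ == "__main__":
    print(substitution_class("L", "I"), is_preferred_aliphatic_swap("L", "I"))
    print(substitution_class("G", "L"), is_preferred_aliphatic_swap("G", "L"))
```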

[0043] Amino acid deletions or insertions may also be made relative to the amino acid sequence of TATi. Thus, for example, amino acids which do not have a substantial effect on the activity of TATi, or at least which do not eliminate such activity, may be deleted. Amino acid insertions relative to the sequence of TATi can also be made. This may be done to alter the properties of a substance of the present invention (e.g. to assist in identification, purification or expression, where the protein is obtained from a recombinant source, including a fusion protein). Such amino acid changes relative to the sequence of TATi from a recombinant source can be made using any suitable technique, e.g. by using site-directed mutagenesis. The TATi molecule may, of course, be prepared by standard chemical synthetic techniques, e.g. solid phase peptide synthesis, or by preparation of nucleic acid encoding TATi and subsequent expression of the nucleic acid in a suitable host cell system.

[0044] It should be appreciated that amino acid substitutions or insertions within the scope of the present invention can be made using naturally occurring or non-naturally occurring amino acids. Whether or not natural or synthetic amino acids are used, it is preferred that only L-amino acids are present.

[0045] Whatever amino acid changes are made (whether by means of substitution, insertion or deletion), preferred polypeptides of the present invention have at least 50% sequence identity with a polypeptide as defined in a) above; more preferably, the degree of sequence identity is at least 75%. Sequence identities of at least 90% or at least 95% are most preferred.

[0046] The degree of amino acid sequence identity can be calculated using a program such as “bestfit” (Smith and Waterman, Advances in Applied Mathematics, 482-489 (1981)) to find the best segment of similarity between any two sequences. The alignment is based on maximising the score achieved using a matrix of amino acid similarities, such as that described by Schwartz and Dayhoff (1979) Atlas of Protein Sequence and Structure, Dayhoff, M. O., Ed pp 353-358. Where high degrees of sequence identity are present there will be relatively few differences in amino acid sequence.
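Finding the best segment of similarity by maximising an alignment score can be sketched with a basic Smith-Waterman local alignment. The version below is illustrative only: it uses a flat match/mismatch score and a linear gap penalty in place of an amino acid similarity matrix, the parameter values are arbitrary assumptions, and it is not the “bestfit” program itself.

```python
def smith_waterman_score(
    a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2
) -> int:
    """Best local alignment score between a and b (Smith-Waterman).

    Assumptions: flat match/mismatch scoring and a linear gap penalty stand in
    for a similarity matrix; only the score is returned, not the alignment.
    """
    cols = len(b) + 1
    prev = [0] * cols  # scores for the previous row of the dynamic programme
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor of 0.
            curr[j] = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # The shared segment "ACGT" is found despite unrelated flanking residues.
    print(smith_waterman_score("XXACGTXX", "YYACGTYY"))
```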

[0047] The nucleic acid encoding the transacting sequence of T. gondii can be a sequence complementary to, or homologous with, the nucleic acid sequence for TATi-1 or TATi-3.

[0048] A nucleic acid sequence which is complementary to a nucleic acid sequence useful in a method of the present invention is a sequence which hybridises to such a sequence under stringent conditions, or a nucleic acid sequence which is homologous to or would hybridise under stringent conditions to such a sequence but for the degeneracy of the genetic code, or an oligonucleotide sequence specific for any such sequence. The nucleic acid sequences include oligonucleotides composed of nucleotides and also those composed of peptide nucleic acids. Where the nucleic sequence is based on a fragment of the gene encoding TATi, the fragment may be at least any ten consecutive nucleotides from the gene, or for example an oligonucleotide composed of from 20, 30, 40, or 50 nucleotides.

[0049] Stringent conditions of hybridisation may be characterised by low salt concentrations or high temperature conditions. For example, highly stringent conditions can be defined as being hybridisation to DNA bound to a solid support in 0.5M NaHPO₄, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at 65° C. and washing in 0.1×SSC/0.1% SDS at 68° C. (Ausubel et al eds. “Current Protocols in Molecular Biology” 1, page 2.10.3, published by Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., New York, (1989)). In some circumstances less stringent conditions may be required. As used in the present application, moderately stringent conditions can be defined as comprising washing in 0.2×SSC/0.1% SDS at 42° C. (Ausubel et al (1989) supra). Hybridisation can also be made more stringent by the addition of increasing amounts of formamide to destabilise the hybrid nucleic acid duplex. Thus particular hybridisation conditions can readily be manipulated, and will generally be selected according to the desired results. In general, convenient hybridisation temperatures in the presence of 50% formamide are 42° C. for a probe which is 95 to 100% homologous to the target DNA, 37° C. for 90 to 95% homology, and 32° C. for 70 to 90% homology.
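The convenient hybridisation temperatures given above for 50% formamide can be written as a small lookup. The sketch below is illustrative: the function name is hypothetical, and because the stated ranges share their end points (90% and 95%), it assumes that a boundary value belongs to the higher-homology band. No temperature is given in the text below 70% homology, so the function raises an error there.

```python
def hybridisation_temp_c(percent_homology: float) -> int:
    """Convenient hybridisation temperature (deg C) in 50% formamide.

    Assumption: boundary values (90% and 95%) are assigned to the higher band.
    """
    if not 0 <= percent_homology <= 100:
        raise ValueError("percent homology must lie between 0 and 100")
    if percent_homology >= 95:
        return 42
    if percent_homology >= 90:
        return 37
    if percent_homology >= 70:
        return 32
    raise ValueError("no temperature is given below 70% homology")


if __name__ == "__main__":
    for homology in (98, 92, 75):
        print(homology, hybridisation_temp_c(homology))
```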

[0050] Examples of preferred nucleic acid sequences for use in a method of the present invention are shown in the attached Figures.

[0051] The nucleic acid constructs of this aspect of the invention can be provided in the form of vectors, suitably expression vectors.

[0052] The term “vector” or “expression vector” generally refers to any nucleic acid vector which may be RNA, DNA or cDNA.

[0053] The term “expression vector” may include, among others, chromosomal, episomal, and virus-derived vectors, for example, vectors derived from bacterial plasmids, from bacteriophage, from transposons, from yeast episomes, from insertion elements, from yeast chromosomal elements, from viruses such as baculoviruses, papova viruses, such as SV40, vaccinia viruses, adenoviruses, fowl pox viruses, pseudorabies viruses and retroviruses, and vectors derived from combinations thereof, such as those derived from plasmid and bacteriophage genetic elements, such as cosmids and phagemids. Generally, any vector suitable to maintain, propagate or express nucleic acid to express a polypeptide in a host may be used for expression in this regard.

[0054] In certain embodiments of the invention, the vectors may provide for specific expression. Such specific expression may be inducible expression or expression only in certain types of cells or both inducible and cell-specific. Preferred among inducible vectors are vectors that can be induced for expression by environmental factors that are easy to manipulate, such as temperature and nutrient additives. Particularly preferred among inducible vectors are vectors that can be induced for expression by changes in the levels of chemicals, for example, chemical additives such as antibiotics. A variety of vectors suitable for use in the invention, including constitutive and inducible expression vectors for use in prokaryotic and eukaryotic hosts, are well known and employed routinely by those skilled in the art.

[0055] Recombinant expression vectors will include, for example, origins of replication, a promoter preferably derived from a highly expressed gene to direct transcription of a structural sequence, and a selectable marker to permit isolation of vector containing cells after exposure to the vector.

[0056] Expression vectors may comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation regions, splice donor and acceptor sites, transcriptional termination sequences, and 5′-flanking non-transcribed sequences that are necessary for expression. Preferred expression vectors according to the present invention may be devoid of enhancer elements.

[0057] The promoter sequence may be any suitable known promoter, for example the human cytomegalovirus (CMV) promoter, the CMV immediate early promoter, the HSV thymidine kinase promoter, the early and late SV40 promoters or the promoters of retroviral LTR's, such as those of the Rous sarcoma virus (“RSV”), and metallothionein promoters, such as the mouse metallothionein-I promoter. The promoter may comprise the minimum sequence required for promoter activity (such as a TATA box without enhancer elements), for example, the minimal sequence of the CMV promoter (mCMV). The promoter, if present, can be contiguous to the TetR sequence.

[0058] The expression vectors or vectors of the invention can be derived from a vector devoid of its own promoter and enhancer elements, for example the plasmid vector PGL2. Enhancers are able to bind to promoter regions situated several thousands of bases away through DNA folding (Rippe et al TIBS 20: 500-506 (1995)).

[0059] The expression vectors may also include selectable markers, such as antibiotic resistance, which enable the vectors to be propagated.

[0060] The nucleic acid sequence of the first aspect of the invention may additionally comprise a reporter transcription unit lacking a promoter region, such as a chloramphenicol acetyl transferase (“CAT”) or DHFR-TS transcription unit. As is well known, introduction into an expression vector of a promoter-containing fragment at a restriction site upstream of the CAT gene engenders the production of CAT activity, which can be detected by standard CAT assays. The application of reporter genes relates to the phenotype of these genes which can be assayed in a transformed organism and which is used, for example, to analyse the induction and/or repression of gene expression. Reporter genes for use in studies of gene regulation include other well known reporter genes including the lux gene encoding luciferase which can be assayed by a bioluminescence assay, the uidA gene encoding β-glucuronidase which can be assayed by a histochemical test, the aphIV gene encoding hygromycin phosphotransferase which can be assayed by testing for hygromycin resistance in the transformed organism, the dhfr gene encoding dihydrofolate reductase which can be assayed by testing for methotrexate resistance in the transformed organism, the neo gene encoding neomycin phosphotransferase which can be assayed by testing for kanamycin resistance in the transformed organism and the lacZ gene encoding β-galactosidase which can be assayed by a histochemical test. All of these reporter genes are obtainable from E. coli except for the lux gene. Sources of the lux gene include the luminescent bacteria Vibrio harveyi and V. fischeri, the firefly Photinus pyralis and the marine organism Renilla reniformis.

[0061] The invention can also be described as providing an Apicomplexan tetracycline-inducible transactivator (TATi) system, comprising the tetracycline repressor (TetR) and a transacting factor of T. gondii. Alternatively, it can be described as providing a tetracycline-inducible transactivator (TATi) system, comprising the tetracycline repressor (TetR) and a transacting factor of T. gondii for use in Apicomplexan species.

[0062] The advantages of the invention extend to the production of live attenuated vaccines suitable to prevent infection by Apicomplexan parasites, and to the provision of a system that permits the generation of conditional knock-outs of essential gene(s) in such parasites. Such knock-outs lead to a greater understanding of the parasite's metabolism, which may allow for the design of new pharmaceutical agents to block or inhibit the function of the essential gene(s). This system allows for the identification of essential genes and for the validation of such genes as drug targets or vaccine candidates.

[0063] To unravel the function of essential genes in an Apicomplexan parasite, for example Toxoplasma gondii or Plasmodium species, the present invention has established a tetracycline-inducible transactivator system (TATi), which ectopically controls gene expression. In a mutant T. gondii strain expressing TATi, a second copy of a gene of interest can be introduced into the cell under the control of the Tet promoter, and the function of the native gene disrupted, for example by homologous recombination, or another gene targeted insertion sufficient to prevent normal gene function. The mutant obtained by this procedure is a fully conditional mutant thus enabling study of the gene concerned.
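The regulatory logic of such a conditional mutant can be summarised as a small truth table. The sketch below is illustrative only: the function and parameter names are hypothetical, and it assumes, as for the tTA system of Gossen and Bujard, that the tetO-controlled copy is expressed in the absence of anhydrotetracycline (ATc) and silenced in its presence.

```python
from itertools import product


def gene_product_present(endogenous_intact: bool,
                         tati_expressed: bool,
                         teto_transgene: bool,
                         atc_present: bool) -> bool:
    """Return True if the gene of interest is expected to be expressed.

    Assumption (hypothetical model): the tetO-driven second copy is active
    only when the transactivator is present and ATc is absent.
    """
    transgene_on = tati_expressed and teto_transgene and not atc_present
    return endogenous_intact or transgene_on


if __name__ == "__main__":
    # Enumerate the knockout background (endogenous gene disrupted,
    # transactivator and tetO-driven copy both present).
    for atc in (False, True):
        state = gene_product_present(False, True, True, atc)
        print(f"knockout + tetO copy, ATc={atc}: expressed={state}")
    # Enumerate every combination for completeness.
    for combo in product((False, True), repeat=4):
        print(combo, gene_product_present(*combo))
```

In this model the mutant behaves as wild type until ATc is added, which is what makes the knockout conditional.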

[0064] According to a second aspect of the invention, there is provided a host cell transformed with a nucleic acid construct according to the first aspect. The host cell can be a bacterium, for example Escherichia coli, a yeast cell, for example Saccharomyces cerevisiae or Schizosaccharomyces pombe, or a protozoan, for example Toxoplasma gondii or Plasmodium species, for example Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii or Plasmodium knowlesi, or Trypanosoma brucei, or Entamoeba histolytica, or Giardia lamblia.

[0065] Introduction of an expression vector into the host cell can be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, microinjection, cationic lipid-mediated transfection, electroporation, transduction, scrape loading, ballistic introduction, infection or other methods. Such methods are described in many standard laboratory manuals, such as Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989).

[0066] According to a third aspect of the invention, there is provided a nucleic acid construct according to the first aspect of the invention for use in medicine. Typically, such uses will be for the treatment or prevention of infections caused by a protozoan, for example Toxoplasma gondii or Plasmodium species, for example Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii or Plasmodium knowlesi, or Trypanosoma brucei, or Entamoeba histolytica, or Giardia lamblia. This aspect of the invention therefore extends to a method of treatment for or prevention of an infection caused by a protozoan, for example Toxoplasma gondii or Plasmodium species, for example Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii or Plasmodium knowlesi, or Trypanosoma brucei, or Entamoeba histolytica, or Giardia lamblia, comprising administration of a nucleic acid construct according to the first aspect of the invention.

[0067] Malaria is a disease condition caused in animals by Plasmodium spp. parasites, characterised by fever in mild forms and by metabolic acidosis, severe anaemia and cerebral malaria in more severe forms, and sometimes resulting in the death of the infected subject. It is also possible for asymptomatic infection to occur in some affected subjects. In humans, the disease is caused by Plasmodium falciparum and to a lesser extent by Plasmodium vivax, the parasites being transmitted by Anopheles spp. mosquitoes. Malaria is caused in rodents by Plasmodium berghei and Plasmodium yoelii, and in rhesus monkeys by Plasmodium knowlesi.

[0068] According to a fourth aspect of the invention, there is provided a vaccine composition comprising a protozoan, for example Toxoplasma gondii or Plasmodium species, for example Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii or Plasmodium knowlesi, or Trypanosoma brucei, or Entamoeba histolytica, or Giardia lamblia, transfected with a nucleic acid construct according to the first aspect of the invention.

[0069] According to a fifth aspect of the invention, there is provided the use of a nucleic acid construct according to the first aspect of the invention in the preparation of a vaccine for use in the treatment or prophylaxis of an infection caused by a protozoan, for example Toxoplasma gondii or Plasmodium species, for example Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii or Plasmodium knowlesi, or Trypanosoma brucei, or Entamoeba histolytica, or Giardia lamblia.

[0070] According to a sixth aspect of the invention, there is provided a process for the preparation of a nucleic acid construct according to the first aspect of the invention, the process comprising ligating together nucleic acid sequences encoding a tetracycline-controlled transactivator and a transacting factor of T. gondii, optionally including linker or additional sequences.

[0071] According to a seventh aspect of the invention, there is provided a process for the preparation of a host cell according to the second aspect of the invention, the process comprising transfecting a cell with a nucleic acid construct according to the first aspect of the invention.

[0072] According to an eighth aspect of the invention, there is provided a process for the preparation of a vaccine composition according to the fourth aspect of the invention, the process comprising transfecting a host cell with a nucleic acid construct according to the first aspect of the invention.

[0073] As described above, the nucleic acid constructs, vectors or expression vectors of the invention can be used in medicine, and the invention therefore extends to compositions comprising the nucleic acid construct according to the first aspect and embodiments of the subsequent aspects as appropriate of the invention. Therefore, the nucleic acid constructs, vectors, or expression vectors or systems of the present invention may be employed in combination with a pharmaceutically acceptable carrier or carriers.

[0074] Such carriers may include, but are not limited to, saline, buffered saline, dextrose, liposomes, water, glycerol, ethanol and combinations thereof.

[0075] The nucleic acid construct, expression vector or vectors of the invention may be employed alone or in conjunction with other compounds, such as therapeutic compounds.

[0076] The pharmaceutical compositions may be administered in any convenient manner effective for treating a patient's disease including, for instance, administration by oral, topical, intravenous, intramuscular, intranasal, or intradermal routes among others. In therapy or as a prophylactic, the active agent may be administered to an individual as an injectable composition, for example as a sterile aqueous dispersion, preferably isotonic.

[0077] The invention also provides a kit of parts comprising a nucleic acid construct, expression vector or vector of the invention as defined above and an administration vehicle including, but not limited to, tablets for oral administration, inhalers for lung administration and injectable solutions for intravenous administration.

[0078] Preferred features of the second and subsequent aspects of the invention are as for the first aspect mutatis mutandis.

[0079] The invention will now be further described by way of example with reference to the following Examples, which are presented for the purposes of illustration only and are not to be construed as being limiting on the invention. In the Examples reference is made to a number of Figures in which:

[0080] FIG. 1 shows trapping of a functional transactivator in T. gondii. FIG. 1(a) shows the scheme of the transactivator-TRAP strategy. A linear DNA fragment encoding the dhfrts selectable marker and TetR without a STOP codon was stably integrated at random into the genome of the recipient strain. Upon integration in a locus where a functional transactivator fusion has been generated, the TetR-fusion activates expression of the selectable marker gene HXGPRT and of the reporter gene LacZ. FIG. 1(b) shows that the clone TATi-1 regulated LacZ expression in an ATc-dependent manner. Parasites were grown for 48 hours in the presence or absence of the drug, fixed and stained with X-Gal to monitor LacZ expression as performed previously (Meissner et al Nucleic Acids Res. 29 e115 (2001)). FIG. 1(c) shows the amino acid sequence of the transactivating domain of TATi-1 (lower sequence line, starting “-PTF”) fused to tetR (upper sequence line, starting “..HQ”). FIG. 1(d) shows transient transfection of p7TetOS1LacZ-CAT into RH parasites or into strains expressing TATi or tTA^(2s). Cells were grown for 48 hours in the presence or absence of ATc before parasite lysates were prepared to quantify LacZ expression.

[0081] FIG. 2 shows the generation of a conditional knockout for the TgMyoA gene. FIG. 2(a) shows modulation of mycMyoA transgene expression in TATi-1 parasites. mycMyoA was detected by IFA, using mAb anti-myc, on intracellular parasites incubated in the presence or absence of ATc for 48 hours. MIC4 was detected under the same settings and exposure time with rabbit polyclonals as control. FIG. 2(b) shows Western blot analysis of mycMyoA expression. Parasite lysates were probed with monoclonal anti-myc and polyclonal anti-MyoA. As internal standard, the lysate was probed with polyclonal anti-MIC4. The upper band corresponds to the inducible mycMyoA. Parasites were grown for 48 hours in the presence or absence of ATc before lysates were prepared. FIG. 2(c) shows detection of endogenous MyoA and mycMyoA genes by analytical PCR on genomic DNA from RH, TATi-1 transformed with T7S4mycMyoA, myoako1 and myoako2. Sequence specific primers were used to amplify endogenous TgMyoA (in an intron) and transgenic mycMyoA (in the 5′-UTR). FIG. 2(d) shows analysis of myoako1 by Western blot with anti-MyoA antibodies, which reveals that this clone lacked endogenous TgMyoA. FIG. 2(e) shows inducibility of mycMyoA expression in clones myoako1 and myoako2, which have regulatable mycMyoA. Intracellular parasites were grown for 48 hours in the presence or absence of ATc before preparation of total cell lysate. The level of detection of TgMIC4 was used as control for equal loading.

[0082] FIG. 3 shows phenotypic consequences of TgMyoA depletion for parasite propagation in culture. FIG. 3(a) shows a plaque assay for myoako1 and RH-wt. Parasites were continuously grown on an HFF monolayer in the presence or absence of ATc for 10 days before fixation and staining of the cells with Giemsa. FIG. 3(b) shows an invasion assay of myoako1 in comparison to RH-wt parasites. 5×10⁵ freshly lysed parasites grown in the presence or absence of ATc for 48 hours were inoculated on an HFF monolayer for 20 minutes, followed by a washing step to remove extracellular parasites. Cells were further incubated for 24 hours before fixation. The number of vacuoles represented successful invasion events, and vacuoles were counted in 40 eye-fields for each parasite. 100% represents the number of vacuoles in the absence of ATc for RH and myoako1 respectively. FIG. 3(c) shows an egression assay of myoako. Parasites were grown for 36 hours on HFF cells in the presence or absence of ATc. Cells were fixed 5 minutes after addition of Ca-ionophore A23187 according to Black et al (Mol. Cell. Biol. 20 9399-9408 (2000)) and analysed by IFA using anti-SAG1 antibodies. FIG. 3(d) shows quantification of egress as a function of incubation time after addition of Ca-ionophore A23187.

[0083] FIG. 4 shows that parasites depleted in TgMyoA are avirulent in mice and confer protection against a new challenge with wild-type parasites. FIG. 4(a) shows the survival of BALB/c mice injected i.p. with tachyzoites from RH or myoako1 mutants and monitored for more than 30 days. Groups of mice were given 0.2 mg/ml ATc in drinking water or normal water. After 11 days, the group of 10 mice infected with myoako and treated with ATc had survived and the drug was removed. After 17 days, 5 mice were challenged with RH wild-type parasites and survived the infection. FIG. 4(b) shows the development/induction of T. gondii-specific T cells after infection with myoako1 mutants. At day 21 after infection, spleens were isolated and the development of T. gondii-specific T cells was determined by IFN-γ-ELISPOT. The mean of triplicates±is shown.

[0084] FIG. 5 shows the plasmid maps and respective nucleotide sequences for the vectors referred to in the Examples and FIGS. 1 to 4.

[0085] FIGS. 5(a) to (d) show the vectors used to conditionally express MyoA or GFP: FIG. 5(a) shows pTetO7Sag4-MyoA, FIG. 5(b) shows pTetO7Sag1-MyoA, FIG. 5(c) shows pTetO7Sag4-GFP, and FIG. 5(d) shows pTetO7Sag1-GFP.

[0086] FIGS. 5(e) to (g) show the vectors used to generate the recipient strain for the transactivator screening by random insertion: FIG. 5(e) shows pTetO7Sag1-HXGPRT, FIG. 5(f) shows pTetO7Sag1-LacZ, and FIG. 5(g) shows pTetO7Sag4-LacZ.

[0087] FIG. 5(h) shows the vector used for the random integration, pTub8TetRsynthetic: pTub8TetR-GCN5-DHFRTS. FIGS. 5(i) and 5(j) show the vectors used to express a functional transactivator: FIG. 5(i) shows pTub8TATi-1-HXGPRT, and FIG. 5(j) shows pTub8TATi-3-HXGPRT.

[0088] FIG. 6 shows the nucleic acid sequences and presumed amino acid sequences of TATi-1 and TATi-3.

EXAMPLE 1

[0089] Preparation of Tet-Inducible Transactivator Construct

[0090] T. gondii tachyzoites (RH hxgprt⁻) were grown in human foreskin fibroblasts (HFF) and maintained in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal calf serum (FCS), 2 mM glutamine and 25 μg/ml gentamicin. To generate stable transformants, 5×10⁷ freshly released RHhxgprt⁻ parasites were transfected and selected in the presence of mycophenolic acid and xanthine (MPA/X) as previously described (Donald et al J. Biol. Chem. 271 14010-14019 (1996)). Selection based on chloramphenicol or pyrimethamine resistance was achieved as described earlier (Soldati, D., & Boothroyd, J. C., Science 260 349-352 (1993); Donald, R. G. K., & Roos, D. S., Proc. Nat'l Acad. Sci. USA 90 11703-11707 (1993)). Homologous recombination was obtained as described previously (Reiss et al J. Cell. Biol. (2001)).

[0091] Plasmids

[0092] The reporter plasmids for the tet-transactivator system, p7tetOHXGPRT and p7tetOLacZCAT, were described previously (Meissner et al Nucleic Acids Res. 29 e115 (2001)). For the construction of p7tetOS4mycMyoA, the promoter region of p5RT70Tet4mycMyoA (Meissner et al Nucleic Acids Res. 29 e115 (2001)) was exchanged for p7tetOS4 using NsiI/PacI. The resulting plasmid p7tetOS4mycMyoA was stably introduced by cotransfection with pDHFRTs using DHFR-selection.

[0093] The construct pTgMyoA-kkoTCAT was composed of pT230 CAT, previously described (Soldati, D. & Boothroyd, J. C., Mol. Cell. Biol. 15 87-93 (1995)), flanked on both sides with 2.0 and 2.2 kbp of 5′- and 3′-flanking sequences of the TgMyoA gene. The vector was linearised at both ends of the flanking sequences to raise the frequency of double homologous recombination.

[0094] The plasmid pTRep-DHFRTs was constructed in two steps. The TetR fragment was amplified with the oligonucleotides Rep-4 and Rep-7, which are:

[0095] Rep-4

[0096] 5′-CGGAATTCCTTTTCGACAAAATGTCGCGCCTGGACAAGAGCAAAGTCATCAACTCTGC-3′

[0097] Rep-7

[0098] 5′-CCCTTAATTAATGCATACCGCTTTCGCACTTCAGCTG-3′

[0099] The resulting PCR-fragment encoding TetR without a STOP-codon was cloned in the EcoRI/PacI sites of p5RT70GFP. In the second step, the TgDHFRTS selectable marker gene was inserted into the SacII-site.
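The cloning sites carried by the two oligonucleotides can be checked with a short script. The following is a minimal sketch, not part of the original protocol: the function name is hypothetical, and the recognition sequences used (EcoRI GAATTC, PacI TTAATTAA, NsiI ATGCAT) are the standard ones for these enzymes rather than values stated in this specification. All three sites are palindromic, so scanning one strand is sufficient.

```python
# Standard recognition sequences (assumed from general knowledge,
# not taken from the specification).
SITES = {"EcoRI": "GAATTC", "PacI": "TTAATTAA", "NsiI": "ATGCAT"}

# Primer sequences as given above, written 5' to 3'.
REP4 = "CGGAATTCCTTTTCGACAAAATGTCGCGCCTGGACAAGAGCAAAGTCATCAACTCTGC"
REP7 = "CCCTTAATTAATGCATACCGCTTTCGCACTTCAGCTG"


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for each site in seq.

    Overlapping occurrences are reported, because the search restarts
    one base after each hit.
    """
    seq = seq.upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        hits[name] = positions
    return hits


if __name__ == "__main__":
    for label, primer in (("Rep-4", REP4), ("Rep-7", REP7)):
        print(label, find_sites(primer))
```

Run on the two primers, the scan locates an EcoRI site near the 5′ end of Rep-4 and a PacI site near the 5′ end of Rep-7, consistent with cloning of the PCR fragment into the EcoRI/PacI sites; Rep-7 also carries an overlapping NsiI site.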

[0100] The plasmid pTTATi-1-HX was constructed using the rescued RT-PCR fragment that was amplified using a poly-T primer with a BamHI restriction site at the 3′-end and the tetR-specific primer Rep-4. The fragment was digested using EcoRI/BamHI and inserted between the same sites of p5RT70/HX. The recipient strain for the transactivator trapping was established by co-transformation of RHhxgprt⁻ with p7TetOS1LacZ-CAT and p7TetOHXGPRT. Stable parasites were selected using CAT. Integration of both plasmids was verified by analytical PCR.

[0101] Random insertion was used to identify a transactivating domain which functions as a tet-regulatable transactivator due to its fusion with the tetR. For this purpose, a construct expressing the tetR but lacking a stop codon was randomly integrated into the T. gondii genome using dihydrofolate reductase-thymidylate synthase (DHFR-TS) as selectable marker, which was previously shown to exhibit a very high frequency of integration (10⁻²). The recipient parasite cell line used for the screening was deficient in the HXGPRT gene and was transformed with two vectors expressing LacZ and HXGPRT under the control of a minimal promoter containing 7 tet-operator sequences (7tetO) (FIG. 1a). One of the parasite mutants resistant to mycophenolic acid and expressing LacZ in a tet-dependent fashion was characterised further. The tetR fusion was rescued by RT-PCR, cloned and sequenced. The fusion, consisting of 26 amino acids attached to the C-terminus of tetR, was named TATi-1 (FIG. 1b). A new expression vector for TATi-1, pTTATi-1-HX, was constructed and shown to confer tet-dependent LacZ expression when transiently transfected into a parasite strain containing p7TetOLacZ. A stable line expressing TATi-1 was generated in RH, using HXGPRT as selection, to establish a recipient strain for the tet-system.

EXAMPLE 2

[0102] Plaque, Invasion and Egression Assays

[0103] Among many vital functions in obligate intracellular parasites, the process of host cell invasion is a prerequisite for their survival and replication. Penetration into host cells is an active process dependent on parasite motion. Gliding motility has previously been shown to require an intact actin cytoskeleton and to be powered by a myosin motor. The small unconventional myosin A (TgMyoA) is the primary candidate; it localises beneath the plasma membrane (Heintzelman, M. B. & Schwartzman, J. D. J. Mol. Biol. 271 139-146 (1997); Hettmann et al Mol. Cell. Biol. 11 1385-1400 (2000)) and exhibits all the biochemical and biophysical properties necessary to generate fast movements. Consistent with an essential role in parasite survival, all our attempts to disrupt the TgMyoA gene have failed so far.

[0104] A second copy of TgMyoA under the control of the tet-promoter was introduced into the TATi-1 expressing cell line, and the modulation of TgMyoA transgene expression upon treatment with ATc was monitored by Western blot and indirect immunofluorescence (FIG. 2a, b). Using this robust inducible tet-system, we were able to disrupt the endogenous TgMyoA gene by homologous recombination with a vector carrying 5′- and 3′-TgMyoA flanking sequences and chloramphenicol acetyltransferase as selectable marker. The absence of the endogenous copy was determined by genomic PCR and by Western blot (FIG. 2c). The anticipated role of TgMyoA in parasite motion implied that parasites lacking the protein would be impaired in host cell invasion, egression and spreading. The phenotypic consequences of TgMyoA depletion could be best visualised in a plaque assay (FIG. 3a). After inoculation of HFF monolayers, mutant parasites cultivated with ATc over a period of 5-7 days showed an inability to form plaques in the HFF monolayers, while non-treated parasites formed large plaques of lysis. This process was reversible upon removal of ATc (data not shown). The process of host cell penetration was examined specifically by invasion assay using freshly lysed parasites previously cultivated in the presence or absence of ATc for 72 hours. The invasion rate was determined 24 hours later by counting the number of vacuoles (FIG. 3b). The number of parasites per vacuole was identical for control and conditional myoako parasites, treated or not with ATc, indicating that the depletion in TgMyoA did not affect intracellular growth. Parasites use the same machinery to penetrate and egress host cells, so we performed an egression assay using the calcium ionophore for a short period of 5 min, followed by fixation and visualisation by IFA using anti-surface antigen 1 (SAG1) antibodies (FIG. 3c). The egression assay was performed as described previously (Black et al Mol. Cell. Biol. 20 9399-9408 (2000)). The regulated exocytosis by the apical organelles called micronemes plays a critical role in gliding motility and invasion. The micronemes secrete complexes of transmembrane and soluble adhesins upon a rise in parasite intracellular calcium. These complexes interact with host cell receptors, and their redistribution toward the posterior pole drives gliding motion. To exclude any involvement of TgMyoA in microneme exocytosis that would explain the phenotypes observed, we examined the discharge by micronemes. The secretion assay revealed no alteration of exocytosis upon TgMyoA depletion (FIG. 3d).
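The invasion rate in FIG. 3(b) is expressed relative to the untreated culture of the same strain, with the vacuole count in the absence of ATc set to 100%. A minimal sketch of that normalisation follows; the function name is hypothetical and the counts in the usage example are placeholders chosen for illustration, not data from the Examples.

```python
def relative_invasion(vacuoles_plus_atc: int, vacuoles_minus_atc: int) -> float:
    """Invasion in the presence of ATc as a percentage of the same
    strain's invasion in the absence of ATc (defined as 100%)."""
    if vacuoles_minus_atc <= 0:
        raise ValueError("untreated control count must be positive")
    if vacuoles_plus_atc < 0:
        raise ValueError("vacuole counts cannot be negative")
    return 100.0 * vacuoles_plus_atc / vacuoles_minus_atc


if __name__ == "__main__":
    # Placeholder counts (vacuoles summed over the counted fields);
    # these numbers are invented for illustration only.
    example_counts = {
        "RH": {"plus_atc": 190, "minus_atc": 200},
        "myoako1": {"plus_atc": 40, "minus_atc": 200},
    }
    for strain, c in example_counts.items():
        pct = relative_invasion(c["plus_atc"], c["minus_atc"])
        print(f"{strain}: {pct:.1f}% of untreated control")
```

Normalising each strain to its own untreated control separates the effect of ATc-induced depletion from any baseline difference between the strains.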

EXAMPLE 3

[0105] Murine Virulence Assay

[0106] Freshly lysed tachyzoites from infected HFF monolayers were washed in PBS and counted under the microscope. Tachyzoites were injected intraperitoneally (i.p.) in 0.2 ml into mice aged 6-8 weeks. The actual p.f.u. in the inoculum was determined by plaque assay. Mice on anhydrotetracycline treatment were housed five per cage and were given a 1 mg/ml solution instead of normal drinking water.

[0107] RH is a type I strain of T. gondii, which typically kills mice with an LD100 of a single infectious organism. The conditional myoako mutant was assessed for virulence in mice. Both wild type and mutant parasites were inoculated intraperitoneally in two groups of mice supplemented or not with ATc in the drinking water. After even days, all the mice infected with the control and with the conditional myoako still expressing the TgMyoA transgene were dead (FIG. 4a). In contrast, we observed 100% survival at 11 days post-infection in the group of mice infected with myoako and supplemented with ATc. All the mice were serologically positive for T. gondii as monitored by immunoblot (data not shown). At day 21 post-infection, two mice were sacrificed and analysed for the presence of T. gondii-specific T-cells in the spleen, as determined by an IFN-γ-specific ELISPOT. The analysis was positive (FIG. 4b), indicating that the mice infected by myoako in the presence of ATc had been able to raise a detectable humoral and cellular response against the parasites. To determine whether these mice were protected against subsequent challenge with RH, they were inoculated i.p. with 150 parasites of the RH strain at day 17 post-infection. All the mice survived the challenge, indicating that myoako induced a protective immunity.

[0108] The tet-system described here has been selected in T. gondii and is not active in HeLa cells (data not shown). The activating domain is capable of interacting specifically with the transcription machinery of the parasite and might function as a transactivator in the closely related parasite Plasmodium falciparum.

[0109] While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this disclosure, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Accordingly, the scope of the invention should be limited only by the attached claims.

1 31 1 47 PRT Artificial Sequence Description of Artificial Sequence TetR-activating domain fusion 1 His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu Ile Ile 1 5 10 15 Cys Gly Leu Glu Lys Pro Thr Phe Phe Asn Ser Gly Leu Leu Phe Gln 20 25 30 Thr Gly Thr Thr Leu Asn Pro Ile Ser Val Tyr Ser Phe Asp Leu 35 40 45 2 12 DNA Artificial Sequence Description of Artificial Sequence pTetO7Sag4-MyoA 2 ctgacgcgcc ct 12 3 12 DNA Artificial Sequence Description of Artificial Sequence pTetO7Sag4-MyoA 3 gaaaagtgcc ac 12 4 6423 DNA Artificial Sequence CDS (1270)..(3864) Description of Artificial Sequence pTetO7Sag4-MyoA 4 ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga 60 ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 120 ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta gggttccgat 180 ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt tcacgtagtg 240 ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaata 300 gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat tcttttgatt 360 tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat 420 ttaacgcgaa ttttaacaaa atattaacgc ttacaatttc cattcgccat tcaggctgcg 480 caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540 gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg 600 taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat tgggtaccga 660 gctcgacttt cacttttctc tatcactgat agggagtggt aaactcgact ttcacttttc 720 tctatcactg atagggagtg gtaaactcga ctttcacttt tctctatcac tgatagggag 780 tggtaaactc gactttcact tttctctatc actgataggg agtggtaaac tcgactttca 840 cttttctcta tcactgatag ggagtggtaa actcgacttt cacttttctc tatcactgat 900 agggagtggt aaactcgact ttcacttttc tctatcactg atagggagtg gtaaactcga 960 ggtcgacggt atcgataagc ttacgccgct gagactaact agaaagaagt gtgcaacagt 1020 tcatgaggga caaaaggaat gtgatgcggt tcgcttgaag aaggaatgtt taagcacgtc 1080 aacaatacgc cttggcgaat gttcatgact gttcatgtgg ttcatcggat catttgaaaa 1140 catcgtgagg ctggtacctg gtcgcaaacg tcgtagtgta 
gtaccgacaa taacgtcgtc 1200 gttcaagggg acgcagttct cggaagacgc gtcgcagcat actgcaactg ctttcgtctg 1260 tcttcaacc atg cat gag cag aag ctc atc tcc gag gag gac ctg ctg cat 1311 Met His Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Leu His 1 5 10 cat cat cat cat cat cat gat ggt acc gag ctc gcg agc aag acc acg 1359 His His His His His His Asp Gly Thr Glu Leu Ala Ser Lys Thr Thr 15 20 25 30 tct gag gag ctg aaa acg gcc acg gcg ctg aag aag agg tcg tcc gat 1407 Ser Glu Glu Leu Lys Thr Ala Thr Ala Leu Lys Lys Arg Ser Ser Asp 35 40 45 gtc cac gcg gtc gac cac tcc ggc aat gtg tac aaa gga ttt caa atc 1455 Val His Ala Val Asp His Ser Gly Asn Val Tyr Lys Gly Phe Gln Ile 50 55 60 tgg acg gac ttg gcg ccg tcg gtg aag gag gag ccg gac ctg atg ttt 1503 Trp Thr Asp Leu Ala Pro Ser Val Lys Glu Glu Pro Asp Leu Met Phe 65 70 75 gcc aag tgc atc gtg cag gcg ggg aca gac aag ggg aac ttg acc tgc 1551 Ala Lys Cys Ile Val Gln Ala Gly Thr Asp Lys Gly Asn Leu Thr Cys 80 85 90 gtc cag atc gat cca ccg ggc ttc gac gaa ccg ttc gaa gtc ccg cag 1599 Val Gln Ile Asp Pro Pro Gly Phe Asp Glu Pro Phe Glu Val Pro Gln 95 100 105 110 gcg aat gcg tgg aac gta aac agc ctg atc gac ccc atg acg tac gga 1647 Ala Asn Ala Trp Asn Val Asn Ser Leu Ile Asp Pro Met Thr Tyr Gly 115 120 125 gac atc ggc atg ttg cct cac acg aac att cct tgc gtc ctc gac ttc 1695 Asp Ile Gly Met Leu Pro His Thr Asn Ile Pro Cys Val Leu Asp Phe 130 135 140 ctc aag gtg cgc ttc atg aag aat caa atc tac acg act gcg gac ccg 1743 Leu Lys Val Arg Phe Met Lys Asn Gln Ile Tyr Thr Thr Ala Asp Pro 145 150 155 ctc gtc gtc gcc atc aat ccc ttc cgc gac ctc ggg aac acc acg ctc 1791 Leu Val Val Ala Ile Asn Pro Phe Arg Asp Leu Gly Asn Thr Thr Leu 160 165 170 gac tgg att gtt cga tac aga gac act ttc gac ctc tcc aaa ctc gcg 1839 Asp Trp Ile Val Arg Tyr Arg Asp Thr Phe Asp Leu Ser Lys Leu Ala 175 180 185 190 ccc cat gtt ttc tac acc gcc cga cgc gcg ctc gac aac ctc cac gcc 1887 Pro His Val Phe Tyr Thr Ala Arg Arg Ala Leu Asp Asn Leu His Ala 195 200 205 gtc aac aag tcg caa acg atc atc 
gtg tcc ggt gag tct ggc gcg ggc 1935 Val Asn Lys Ser Gln Thr Ile Ile Val Ser Gly Glu Ser Gly Ala Gly 210 215 220 aag acg gag gcg acg aag cag att atg agg tat ttt gcg gcg gcg aag 1983 Lys Thr Glu Ala Thr Lys Gln Ile Met Arg Tyr Phe Ala Ala Ala Lys 225 230 235 acg ggg tcg atg gat ttg cgg att cag aac gcg atc atg gcg gcg aat 2031 Thr Gly Ser Met Asp Leu Arg Ile Gln Asn Ala Ile Met Ala Ala Asn 240 245 250 cca gtg ctt gag gca ttt gga aat gcg aag acg att cgc aac aac aac 2079 Pro Val Leu Glu Ala Phe Gly Asn Ala Lys Thr Ile Arg Asn Asn Asn 255 260 265 270 tcg tcg cgt ttc gga cgc ttc atg cag ctg gat gtg ggt cgc gaa gga 2127 Ser Ser Arg Phe Gly Arg Phe Met Gln Leu Asp Val Gly Arg Glu Gly 275 280 285 ggc atc aag ttt ggc tcc gtc gtc gcc ttt ctc ctg gaa aag tcg cgt 2175 Gly Ile Lys Phe Gly Ser Val Val Ala Phe Leu Leu Glu Lys Ser Arg 290 295 300 gtt ctc acg cag gac gaa cag gag cgg tcg tac cac atc ttc tac caa 2223 Val Leu Thr Gln Asp Glu Gln Glu Arg Ser Tyr His Ile Phe Tyr Gln 305 310 315 atg tgc aag ggg gcg gac gcg gcg atg aag gag cgc ttc cat atc ctg 2271 Met Cys Lys Gly Ala Asp Ala Ala Met Lys Glu Arg Phe His Ile Leu 320 325 330 ccg ctc tcg gag tac aag tac atc aat ccg ttg tgc ctg gac gcg cca 2319 Pro Leu Ser Glu Tyr Lys Tyr Ile Asn Pro Leu Cys Leu Asp Ala Pro 335 340 345 350 ggg atc gac gac gtc gcg gag ttc cac gaa gtc tgc gag tcg ttc cgg 2367 Gly Ile Asp Asp Val Ala Glu Phe His Glu Val Cys Glu Ser Phe Arg 355 360 365 tcg atg aat ctg acg gag gac gaa gtc gcg agc gtg tgg agc atc gtg 2415 Ser Met Asn Leu Thr Glu Asp Glu Val Ala Ser Val Trp Ser Ile Val 370 375 380 agt gga gtg ctg ctg ctt ggc aac gtc gag gtg aca gcg acg aag gat 2463 Ser Gly Val Leu Leu Leu Gly Asn Val Glu Val Thr Ala Thr Lys Asp 385 390 395 ggg ggg atc gac gac gcc gcg gcg atc gag ggg aag aac ttg gag gtt 2511 Gly Gly Ile Asp Asp Ala Ala Ala Ile Glu Gly Lys Asn Leu Glu Val 400 405 410 ttc aaa aag gcc tgc ggg ctg ctc ttc ctc gac gcg gag cgc att cgc 2559 Phe Lys Lys Ala Cys Gly Leu Leu Phe Leu Asp Ala Glu Arg Ile Arg 415 
420 425 430 gaa gag ctg acg gtg aag gtt tcg tat gcg ggg aat cag gag atc cgc 2607 Glu Glu Leu Thr Val Lys Val Ser Tyr Ala Gly Asn Gln Glu Ile Arg 435 440 445 ggc cgg tgg aag cag gaa gac gga gac atg ctc aag tcg tcg ctc gcg 2655 Gly Arg Trp Lys Gln Glu Asp Gly Asp Met Leu Lys Ser Ser Leu Ala 450 455 460 aag gcg atg tac gac aag ttg ttc atg tgg atc att gcc gtg ttg aac 2703 Lys Ala Met Tyr Asp Lys Leu Phe Met Trp Ile Ile Ala Val Leu Asn 465 470 475 cgc agc atc aag cct ccg ggc ggc ttc aag atc ttc atg ggc atg ctc 2751 Arg Ser Ile Lys Pro Pro Gly Gly Phe Lys Ile Phe Met Gly Met Leu 480 485 490 gac atc ttc ggc ttc gaa gtc ttc aag aac aac tcg ctg gag cag ttc 2799 Asp Ile Phe Gly Phe Glu Val Phe Lys Asn Asn Ser Leu Glu Gln Phe 495 500 505 510 ttc atc aac atc acg aac gaa atg ctg cag aag aac ttc gtc gac atc 2847 Phe Ile Asn Ile Thr Asn Glu Met Leu Gln Lys Asn Phe Val Asp Ile 515 520 525 gtc ttc gac cgc gag agc aag ctg tat cgt gac gag ggt gtc tcc tcc 2895 Val Phe Asp Arg Glu Ser Lys Leu Tyr Arg Asp Glu Gly Val Ser Ser 530 535 540 aag gag ttg att ttc acc tcg aac gca gaa gtg atc aag atc ttg acg 2943 Lys Glu Leu Ile Phe Thr Ser Asn Ala Glu Val Ile Lys Ile Leu Thr 545 550 555 gcg aag aac aac tcg gtg ctc gct gcg ctc gag gac cag tgc ctc gcc 2991 Ala Lys Asn Asn Ser Val Leu Ala Ala Leu Glu Asp Gln Cys Leu Ala 560 565 570 cct gga ggc agc gac gaa aag ttc ctc tcg acc tgc aag aac gcg ctg 3039 Pro Gly Gly Ser Asp Glu Lys Phe Leu Ser Thr Cys Lys Asn Ala Leu 575 580 585 590 aaa gga acc acc aag ttc aag cct gcg aag gtc tct ccg aac atc aat 3087 Lys Gly Thr Thr Lys Phe Lys Pro Ala Lys Val Ser Pro Asn Ile Asn 595 600 605 ttc ctc atc tcg cac act gtc ggc gac atc cag tac aac gcc gaa ggc 3135 Phe Leu Ile Ser His Thr Val Gly Asp Ile Gln Tyr Asn Ala Glu Gly 610 615 620 ttc ctc ttc aaa aac aaa gat gtc ctg cga gca gaa atc atg gaa atc 3183 Phe Leu Phe Lys Asn Lys Asp Val Leu Arg Ala Glu Ile Met Glu Ile 625 630 635 gtg cag caa agc aag aac ccc gtt gtc gcg caa ctc ttc gct ggc atc 3231 Val Gln Gln Ser Lys Asn 
Pro Val Val Ala Gln Leu Phe Ala Gly Ile 640 645 650 gtc atg gag aag ggg aag atg gcc aag gga caa ctg att ggg tcg cag 3279 Val Met Glu Lys Gly Lys Met Ala Lys Gly Gln Leu Ile Gly Ser Gln 655 660 665 670 ttc ctc tcg cag ctg cag agc ctc atg gaa ctt atc aac agc acc gag 3327 Phe Leu Ser Gln Leu Gln Ser Leu Met Glu Leu Ile Asn Ser Thr Glu 675 680 685 cct cac ttc att cgc tgc atc aag ccg aac gac acg aag aag ccc ctc 3375 Pro His Phe Ile Arg Cys Ile Lys Pro Asn Asp Thr Lys Lys Pro Leu 690 695 700 gac tgg gtg ccg tcg aaa atg ctc att cag ctg cac gcg ctc tcc gtc 3423 Asp Trp Val Pro Ser Lys Met Leu Ile Gln Leu His Ala Leu Ser Val 705 710 715 ctc gag gct ctt cag ctc cgt caa ctc ggc tac tct tac aga cgt ccg 3471 Leu Glu Ala Leu Gln Leu Arg Gln Leu Gly Tyr Ser Tyr Arg Arg Pro 720 725 730 ttc aag gag ttc ctc ttc cag ttc aag ttt atc gac ctc tcg gct tct 3519 Phe Lys Glu Phe Leu Phe Gln Phe Lys Phe Ile Asp Leu Ser Ala Ser 735 740 745 750 gaa aat cca aat ctg gac ccc aaa gaa gct gcg ctg aga ctc ctc aaa 3567 Glu Asn Pro Asn Leu Asp Pro Lys Glu Ala Ala Leu Arg Leu Leu Lys 755 760 765 agc agc aaa ctg ccc agc gaa gaa tac cag ctc ggg aag aca atg gtt 3615 Ser Ser Lys Leu Pro Ser Glu Glu Tyr Gln Leu Gly Lys Thr Met Val 770 775 780 ttc ctc aag cag acg ggc gcg aaa gaa ctg acg cag att cag aga gaa 3663 Phe Leu Lys Gln Thr Gly Ala Lys Glu Leu Thr Gln Ile Gln Arg Glu 785 790 795 tgc ctt tct tct tgg gag cct ctc gtc tca gtg ctc gag gcg tac tac 3711 Cys Leu Ser Ser Trp Glu Pro Leu Val Ser Val Leu Glu Ala Tyr Tyr 800 805 810 gct ggc aga cgc cac aag aag cag ctg ctg aaa aag acc ccc ttc atc 3759 Ala Gly Arg Arg His Lys Lys Gln Leu Leu Lys Lys Thr Pro Phe Ile 815 820 825 830 att cgc gcc cag gct cac atc cgc aga cac ctg gtg gac aac aac gtc 3807 Ile Arg Ala Gln Ala His Ile Arg Arg His Leu Val Asp Asn Asn Val 835 840 845 agc ccc gcg act gtt cag ccg gcg ttc gga tcc act cga gat gca ggc 3855 Ser Pro Ala Thr Val Gln Pro Ala Phe Gly Ser Thr Arg Asp Ala Gly 850 855 860 ggt gct taa ttaatcaccg ttgtgctcac ttctcaaatc 
gacaaaggaa 3904 Gly Ala acacacttcg tgcagcatgt gccccattat aaagaaactg agttgttccg ctgtggcttg 3964 caggtgtcac atccacaaaa accggccgac tctaaatagg agtgtttcgc agcaagcagc 4024 gaaagtttat gactgggtcc gaatctctga acggatgtgt ggcggacctg gctgatgttg 4084 atcgccgtcg acacacgcgc cacatgggtc aatacacaag acagctatca gttgttttag 4144 tcgaaccggt taacacaatt cttgcccccc cgagggggat ccactagttc tagagcggcc 4204 gccaccgcgg tggagctcca gcttttgttc cctttagtga gggttaattg cgcgcttggc 4264 gtaatcatgg tcatagctgt ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa 4324 catacgagcc ggaagcataa agtgtaaagc ctggggtgcc taatgagtga gctaactcac 4384 attaattgcg ttgcgctcac tgcccgcttt ccagtcggga aacctgtcgt gccagctgca 4444 ttaatgaatc ggccaacgcg cggggagagg cggtttgcgt attgggcgct cttccgcttc 4504 ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat cagctcactc 4564 aaaggcggta atacggttat ccacagaatc aggggataac gcaggaaaga acatgtgagc 4624 aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag 4684 gctccgcccc cctgacgagc atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc 4744 gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt 4804 tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct 4864 ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg 4924 ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta actatcgtct 4984 tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg gtaacaggat 5044 tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc ctaactacgg 5104 ctacactaga aggacagtat ttggtatctg cgctctgctg aagccagtta ccttcggaaa 5164 aagagttggt agctcttgat ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt 5224 ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc 5284 tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt 5344 atcaaaaagg atcttcacct agatcctttt aaattaaaaa tgaagtttta aatcaatcta 5404 aagtatatat gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg aggcacctat 5464 ctcagcgatc tgtctatttc gttcatccat agttgcctga ctccccgtcg tgtagataac 5524 tacgatacgg gagggcttac catctggccc cagtgctgca atgataccgc 
gagacccacg 5584 ctcaccggct ccagatttat cagcaataaa ccagccagcc ggaagggccg agcgcagaag 5644 tggtcctgca actttatccg cctccatcca gtctattaat tgttgccggg aagctagagt 5704 aagtagttcg ccagttaata gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt 5764 gtcacgctcg tcgtttggta tggcttcatt cagctccggt tcccaacgat caaggcgagt 5824 tacatgatcc cccatgttgt gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt 5884 cagaagtaag ttggccgcag tgttatcact catggttatg gcagcactgc ataattctct 5944 tactgtcatg ccatccgtaa gatgcttttc tgtgactggt gagtactcaa ccaagtcatt 6004 ctgagaatag tgtatgcggc gaccgagttg ctcttgcccg gcgtcaatac gggataatac 6064 cgcgccacat agcagaactt taaaagtgct catcattgga aaacgttctt cggggcgaaa 6124 actctcaagg atcttaccgc tgttgagatc cagttcgatg taacccactc gtgcacccaa 6184 ctgatcttca gcatctttta ctttcaccag cgtttctggg tgagcaaaaa caggaaggca 6244 aaatgccgca aaaaagggaa taagggcgac acggaaatgt tgaatactca tactcttcct 6304 ttttcaatat tattgaagca tttatcaggg ttattgtctc atgagcggat acatatttga 6364 atgtatttag aaaaataaac aaataggggt tccgcgcaca tttccccgaa aagtgccac 6423 5 864 PRT Artificial Sequence Description of Artificial Sequence pTetO7Sag4-MyoA 5 Met His Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Leu His His His 1 5 10 15 His His His His Asp Gly Thr Glu Leu Ala Ser Lys Thr Thr Ser Glu 20 25 30 Glu Leu Lys Thr Ala Thr Ala Leu Lys Lys Arg Ser Ser Asp Val His 35 40 45 Ala Val Asp His Ser Gly Asn Val Tyr Lys Gly Phe Gln Ile Trp Thr 50 55 60 Asp Leu Ala Pro Ser Val Lys Glu Glu Pro Asp Leu Met Phe Ala Lys 65 70 75 80 Cys Ile Val Gln Ala Gly Thr Asp Lys Gly Asn Leu Thr Cys Val Gln 85 90 95 Ile Asp Pro Pro Gly Phe Asp Glu Pro Phe Glu Val Pro Gln Ala Asn 100 105 110 Ala Trp Asn Val Asn Ser Leu Ile Asp Pro Met Thr Tyr Gly Asp Ile 115 120 125 Gly Met Leu Pro His Thr Asn Ile Pro Cys Val Leu Asp Phe Leu Lys 130 135 140 Val Arg Phe Met Lys Asn Gln Ile Tyr Thr Thr Ala Asp Pro Leu Val 145 150 155 160 Val Ala Ile Asn Pro Phe Arg Asp Leu Gly Asn Thr Thr Leu Asp Trp 165 170 175 Ile Val Arg Tyr Arg Asp Thr Phe Asp Leu Ser Lys Leu Ala Pro His 180 185 190 Val 
Phe Tyr Thr Ala Arg Arg Ala Leu Asp Asn Leu His Ala Val Asn 195 200 205 Lys Ser Gln Thr Ile Ile Val Ser Gly Glu Ser Gly Ala Gly Lys Thr 210 215 220 Glu Ala Thr Lys Gln Ile Met Arg Tyr Phe Ala Ala Ala Lys Thr Gly 225 230 235 240 Ser Met Asp Leu Arg Ile Gln Asn Ala Ile Met Ala Ala Asn Pro Val 245 250 255 Leu Glu Ala Phe Gly Asn Ala Lys Thr Ile Arg Asn Asn Asn Ser Ser 260 265 270 Arg Phe Gly Arg Phe Met Gln Leu Asp Val Gly Arg Glu Gly Gly Ile 275 280 285 Lys Phe Gly Ser Val Val Ala Phe Leu Leu Glu Lys Ser Arg Val Leu 290 295 300 Thr Gln Asp Glu Gln Glu Arg Ser Tyr His Ile Phe Tyr Gln Met Cys 305 310 315 320 Lys Gly Ala Asp Ala Ala Met Lys Glu Arg Phe His Ile Leu Pro Leu 325 330 335 Ser Glu Tyr Lys Tyr Ile Asn Pro Leu Cys Leu Asp Ala Pro Gly Ile 340 345 350 Asp Asp Val Ala Glu Phe His Glu Val Cys Glu Ser Phe Arg Ser Met 355 360 365 Asn Leu Thr Glu Asp Glu Val Ala Ser Val Trp Ser Ile Val Ser Gly 370 375 380 Val Leu Leu Leu Gly Asn Val Glu Val Thr Ala Thr Lys Asp Gly Gly 385 390 395 400 Ile Asp Asp Ala Ala Ala Ile Glu Gly Lys Asn Leu Glu Val Phe Lys 405 410 415 Lys Ala Cys Gly Leu Leu Phe Leu Asp Ala Glu Arg Ile Arg Glu Glu 420 425 430 Leu Thr Val Lys Val Ser Tyr Ala Gly Asn Gln Glu Ile Arg Gly Arg 435 440 445 Trp Lys Gln Glu Asp Gly Asp Met Leu Lys Ser Ser Leu Ala Lys Ala 450 455 460 Met Tyr Asp Lys Leu Phe Met Trp Ile Ile Ala Val Leu Asn Arg Ser 465 470 475 480 Ile Lys Pro Pro Gly Gly Phe Lys Ile Phe Met Gly Met Leu Asp Ile 485 490 495 Phe Gly Phe Glu Val Phe Lys Asn Asn Ser Leu Glu Gln Phe Phe Ile 500 505 510 Asn Ile Thr Asn Glu Met Leu Gln Lys Asn Phe Val Asp Ile Val Phe 515 520 525 Asp Arg Glu Ser Lys Leu Tyr Arg Asp Glu Gly Val Ser Ser Lys Glu 530 535 540 Leu Ile Phe Thr Ser Asn Ala Glu Val Ile Lys Ile Leu Thr Ala Lys 545 550 555 560 Asn Asn Ser Val Leu Ala Ala Leu Glu Asp Gln Cys Leu Ala Pro Gly 565 570 575 Gly Ser Asp Glu Lys Phe Leu Ser Thr Cys Lys Asn Ala Leu Lys Gly 580 585 590 Thr Thr Lys Phe Lys Pro Ala Lys Val Ser Pro Asn Ile Asn Phe Leu 595 600 605 Ile Ser 
His Thr Val Gly Asp Ile Gln Tyr Asn Ala Glu Gly Phe Leu 610 615 620 Phe Lys Asn Lys Asp Val Leu Arg Ala Glu Ile Met Glu Ile Val Gln 625 630 635 640 Gln Ser Lys Asn Pro Val Val Ala Gln Leu Phe Ala Gly Ile Val Met 645 650 655 Glu Lys Gly Lys Met Ala Lys Gly Gln Leu Ile Gly Ser Gln Phe Leu 660 665 670 Ser Gln Leu Gln Ser Leu Met Glu Leu Ile Asn Ser Thr Glu Pro His 675 680 685 Phe Ile Arg Cys Ile Lys Pro Asn Asp Thr Lys Lys Pro Leu Asp Trp 690 695 700 Val Pro Ser Lys Met Leu Ile Gln Leu His Ala Leu Ser Val Leu Glu 705 710 715 720 Ala Leu Gln Leu Arg Gln Leu Gly Tyr Ser Tyr Arg Arg Pro Phe Lys 725 730 735 Glu Phe Leu Phe Gln Phe Lys Phe Ile Asp Leu Ser Ala Ser Glu Asn 740 745 750 Pro Asn Leu Asp Pro Lys Glu Ala Ala Leu Arg Leu Leu Lys Ser Ser 755 760 765 Lys Leu Pro Ser Glu Glu Tyr Gln Leu Gly Lys Thr Met Val Phe Leu 770 775 780 Lys Gln Thr Gly Ala Lys Glu Leu Thr Gln Ile Gln Arg Glu Cys Leu 785 790 795 800 Ser Ser Trp Glu Pro Leu Val Ser Val Leu Glu Ala Tyr Tyr Ala Gly 805 810 815 Arg Arg His Lys Lys Gln Leu Leu Lys Lys Thr Pro Phe Ile Ile Arg 820 825 830 Ala Gln Ala His Ile Arg Arg His Leu Val Asp Asn Asn Val Ser Pro 835 840 845 Ala Thr Val Gln Pro Ala Phe Gly Ser Thr Arg Asp Ala Gly Gly Ala 850 855 860 6 6346 DNA Artificial Sequence CDS (1193)..(3787) misc_feature (1102) n is disclosed as an asterisk 6 ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga 60 ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 120 ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta gggttccgat 180 ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt tcacgtagtg 240 ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaata 300 gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat tcttttgatt 360 tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat 420 ttaacgcgaa ttttaacaaa atattaacgc ttacaatttc cattcgccat tcaggctgcg 480 caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540 gggatgtgct gcaaggcgat taagttgggt aacgccaggg 
ttttcccagt cacgacgttg 600 taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat tgggtaccga 660 gctcgacttt cacttttctc tatcactgat agggagtggt aaactcgact ttcacttttc 720 tctatcactg atagggagtg gtaaactcga ctttcacttt tctctatcac tgatagggag 780 tggtaaactc gactttcact tttctctatc actgataggg agtggtaaac tcgactttca 840 cttttctcta tcactgatag ggagtggtaa actcgacttt cacttttctc tatcactgat 900 agggagtggt aaactcgact ttcacttttc tctatcactg atagggagtg gtaaactcga 960 ggtcgacggt atcgataagc ttcaatgtgc acctgtagga agctgtagtc actgctgatt 1020 ctcactgttc tcggcaaggg ccgacgaccg gagtacagtt tttgtgggca gagccgttgt 1080 gcagctttcc gttcttctcg gntgtgtcac atgtgtcatt gtcgtgtaaa cacacggttg 1140 gatgtcggtt tcgctgcacc acttcattat ttcttctggt tttttgacga gt atg cat 1198 Met His 1 gag cag aag ctc atc tcc gag gag gac ctg ctg cat cat cat cat cat 1246 Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Leu His His His His His 5 10 15 cat cat gat ggt acc gag ctc gcg agc aag acc acg tct gag gag ctg 1294 His His Asp Gly Thr Glu Leu Ala Ser Lys Thr Thr Ser Glu Glu Leu 20 25 30 aaa acg gcc acg gcg ctg aag aag agg tcg tcc gat gtc cac gcg gtc 1342 Lys Thr Ala Thr Ala Leu Lys Lys Arg Ser Ser Asp Val His Ala Val 35 40 45 50 gac cac tcc ggc aat gtg tac aaa gga ttt caa atc tgg acg gac ttg 1390 Asp His Ser Gly Asn Val Tyr Lys Gly Phe Gln Ile Trp Thr Asp Leu 55 60 65 gcg ccg tcg gtg aag gag gag ccg gac ctg atg ttt gcc aag tgc atc 1438 Ala Pro Ser Val Lys Glu Glu Pro Asp Leu Met Phe Ala Lys Cys Ile 70 75 80 gtg cag gcg ggg aca gac aag ggg aac ttg acc tgc gtc cag atc gat 1486 Val Gln Ala Gly Thr Asp Lys Gly Asn Leu Thr Cys Val Gln Ile Asp 85 90 95 cca ccg ggc ttc gac gaa ccg ttc gaa gtc ccg cag gcg aat gcg tgg 1534 Pro Pro Gly Phe Asp Glu Pro Phe Glu Val Pro Gln Ala Asn Ala Trp 100 105 110 aac gta aac agc ctg atc gac ccc atg acg tac gga gac atc ggc atg 1582 Asn Val Asn Ser Leu Ile Asp Pro Met Thr Tyr Gly Asp Ile Gly Met 115 120 125 130 ttg cct cac acg aac att cct tgc gtc ctc gac ttc ctc aag gtg cgc 1630 Leu Pro His Thr Asn Ile Pro Cys Val Leu 
Asp Phe Leu Lys Val Arg 135 140 145 ttc atg aag aat caa atc tac acg act gcg gac ccg ctc gtc gtc gcc 1678 Phe Met Lys Asn Gln Ile Tyr Thr Thr Ala Asp Pro Leu Val Val Ala 150 155 160 atc aat ccc ttc cgc gac ctc ggg aac acc acg ctc gac tgg att gtt 1726 Ile Asn Pro Phe Arg Asp Leu Gly Asn Thr Thr Leu Asp Trp Ile Val 165 170 175 cga tac aga gac act ttc gac ctc tcc aaa ctc gcg ccc cat gtt ttc 1774 Arg Tyr Arg Asp Thr Phe Asp Leu Ser Lys Leu Ala Pro His Val Phe 180 185 190 tac acc gcc cga cgc gcg ctc gac aac ctc cac gcc gtc aac aag tcg 1822 Tyr Thr Ala Arg Arg Ala Leu Asp Asn Leu His Ala Val Asn Lys Ser 195 200 205 210 caa acg atc atc gtg tcc ggt gag tct ggc gcg ggc aag acg gag gcg 1870 Gln Thr Ile Ile Val Ser Gly Glu Ser Gly Ala Gly Lys Thr Glu Ala 215 220 225 acg aag cag att atg agg tat ttt gcg gcg gcg aag acg ggg tcg atg 1918 Thr Lys Gln Ile Met Arg Tyr Phe Ala Ala Ala Lys Thr Gly Ser Met 230 235 240 gat ttg cgg att cag aac gcg atc atg gcg gcg aat cca gtg ctt gag 1966 Asp Leu Arg Ile Gln Asn Ala Ile Met Ala Ala Asn Pro Val Leu Glu 245 250 255 gca ttt gga aat gcg aag acg att cgc aac aac aac tcg tcg cgt ttc 2014 Ala Phe Gly Asn Ala Lys Thr Ile Arg Asn Asn Asn Ser Ser Arg Phe 260 265 270 gga cgc ttc atg cag ctg gat gtg ggt cgc gaa gga ggc atc aag ttt 2062 Gly Arg Phe Met Gln Leu Asp Val Gly Arg Glu Gly Gly Ile Lys Phe 275 280 285 290 ggc tcc gtc gtc gcc ttt ctc ctg gaa aag tcg cgt gtt ctc acg cag 2110 Gly Ser Val Val Ala Phe Leu Leu Glu Lys Ser Arg Val Leu Thr Gln 295 300 305 gac gaa cag gag cgg tcg tac cac atc ttc tac caa atg tgc aag ggg 2158 Asp Glu Gln Glu Arg Ser Tyr His Ile Phe Tyr Gln Met Cys Lys Gly 310 315 320 gcg gac gcg gcg atg aag gag cgc ttc cat atc ctg ccg ctc tcg gag 2206 Ala Asp Ala Ala Met Lys Glu Arg Phe His Ile Leu Pro Leu Ser Glu 325 330 335 tac aag tac atc aat ccg ttg tgc ctg gac gcg cca ggg atc gac gac 2254 Tyr Lys Tyr Ile Asn Pro Leu Cys Leu Asp Ala Pro Gly Ile Asp Asp 340 345 350 gtc gcg gag ttc cac gaa gtc tgc gag tcg ttc cgg tcg atg aat ctg 2302 
Val Ala Glu Phe His Glu Val Cys Glu Ser Phe Arg Ser Met Asn Leu 355 360 365 370 acg gag gac gaa gtc gcg agc gtg tgg agc atc gtg agt gga gtg ctg 2350 Thr Glu Asp Glu Val Ala Ser Val Trp Ser Ile Val Ser Gly Val Leu 375 380 385 ctg ctt ggc aac gtc gag gtg aca gcg acg aag gat ggg ggg atc gac 2398 Leu Leu Gly Asn Val Glu Val Thr Ala Thr Lys Asp Gly Gly Ile Asp 390 395 400 gac gcc gcg gcg atc gag ggg aag aac ttg gag gtt ttc aaa aag gcc 2446 Asp Ala Ala Ala Ile Glu Gly Lys Asn Leu Glu Val Phe Lys Lys Ala 405 410 415 tgc ggg ctg ctc ttc ctc gac gcg gag cgc att cgc gaa gag ctg acg 2494 Cys Gly Leu Leu Phe Leu Asp Ala Glu Arg Ile Arg Glu Glu Leu Thr 420 425 430 gtg aag gtt tcg tat gcg ggg aat cag gag atc cgc ggc cgg tgg aag 2542 Val Lys Val Ser Tyr Ala Gly Asn Gln Glu Ile Arg Gly Arg Trp Lys 435 440 445 450 cag gaa gac gga gac atg ctc aag tcg tcg ctc gcg aag gcg atg tac 2590 Gln Glu Asp Gly Asp Met Leu Lys Ser Ser Leu Ala Lys Ala Met Tyr 455 460 465 gac aag ttg ttc atg tgg atc att gcc gtg ttg aac cgc agc atc aag 2638 Asp Lys Leu Phe Met Trp Ile Ile Ala Val Leu Asn Arg Ser Ile Lys 470 475 480 cct ccg ggc ggc ttc aag atc ttc atg ggc atg ctc gac atc ttc ggc 2686 Pro Pro Gly Gly Phe Lys Ile Phe Met Gly Met Leu Asp Ile Phe Gly 485 490 495 ttc gaa gtc ttc aag aac aac tcg ctg gag cag ttc ttc atc aac atc 2734 Phe Glu Val Phe Lys Asn Asn Ser Leu Glu Gln Phe Phe Ile Asn Ile 500 505 510 acg aac gaa atg ctg cag aag aac ttc gtc gac atc gtc ttc gac cgc 2782 Thr Asn Glu Met Leu Gln Lys Asn Phe Val Asp Ile Val Phe Asp Arg 515 520 525 530 gag agc aag ctg tat cgt gac gag ggt gtc tcc tcc aag gag ttg att 2830 Glu Ser Lys Leu Tyr Arg Asp Glu Gly Val Ser Ser Lys Glu Leu Ile 535 540 545 ttc acc tcg aac gca gaa gtg atc aag atc ttg acg gcg aag aac aac 2878 Phe Thr Ser Asn Ala Glu Val Ile Lys Ile Leu Thr Ala Lys Asn Asn 550 555 560 tcg gtg ctc gct gcg ctc gag gac cag tgc ctc gcc cct gga ggc agc 2926 Ser Val Leu Ala Ala Leu Glu Asp Gln Cys Leu Ala Pro Gly Gly Ser 565 570 575 gac gaa aag ttc ctc tcg 
acc tgc aag aac gcg ctg aaa gga acc acc 2974 Asp Glu Lys Phe Leu Ser Thr Cys Lys Asn Ala Leu Lys Gly Thr Thr 580 585 590 aag ttc aag cct gcg aag gtc tct ccg aac atc aat ttc ctc atc tcg 3022 Lys Phe Lys Pro Ala Lys Val Ser Pro Asn Ile Asn Phe Leu Ile Ser 595 600 605 610 cac act gtc ggc gac atc cag tac aac gcc gaa ggc ttc ctc ttc aaa 3070 His Thr Val Gly Asp Ile Gln Tyr Asn Ala Glu Gly Phe Leu Phe Lys 615 620 625 aac aaa gat gtc ctg cga gca gaa atc atg gaa atc gtg cag caa agc 3118 Asn Lys Asp Val Leu Arg Ala Glu Ile Met Glu Ile Val Gln Gln Ser 630 635 640 aag aac ccc gtt gtc gcg caa ctc ttc gct ggc atc gtc atg gag aag 3166 Lys Asn Pro Val Val Ala Gln Leu Phe Ala Gly Ile Val Met Glu Lys 645 650 655 ggg aag atg gcc aag gga caa ctg att ggg tcg cag ttc ctc tcg cag 3214 Gly Lys Met Ala Lys Gly Gln Leu Ile Gly Ser Gln Phe Leu Ser Gln 660 665 670 ctg cag agc ctc atg gaa ctt atc aac agc acc gag cct cac ttc att 3262 Leu Gln Ser Leu Met Glu Leu Ile Asn Ser Thr Glu Pro His Phe Ile 675 680 685 690 cgc tgc atc aag ccg aac gac acg aag aag ccc ctc gac tgg gtg ccg 3310 Arg Cys Ile Lys Pro Asn Asp Thr Lys Lys Pro Leu Asp Trp Val Pro 695 700 705 tcg aaa atg ctc att cag ctg cac gcg ctc tcc gtc ctc gag gct ctt 3358 Ser Lys Met Leu Ile Gln Leu His Ala Leu Ser Val Leu Glu Ala Leu 710 715 720 cag ctc cgt caa ctc ggc tac tct tac aga cgt ccg ttc aag gag ttc 3406 Gln Leu Arg Gln Leu Gly Tyr Ser Tyr Arg Arg Pro Phe Lys Glu Phe 725 730 735 ctc ttc cag ttc aag ttt atc gac ctc tcg gct tct gaa aat cca aat 3454 Leu Phe Gln Phe Lys Phe Ile Asp Leu Ser Ala Ser Glu Asn Pro Asn 740 745 750 ctg gac ccc aaa gaa gct gcg ctg aga ctc ctc aaa agc agc aaa ctg 3502 Leu Asp Pro Lys Glu Ala Ala Leu Arg Leu Leu Lys Ser Ser Lys Leu 755 760 765 770 ccc agc gaa gaa tac cag ctc ggg aag aca atg gtt ttc ctc aag cag 3550 Pro Ser Glu Glu Tyr Gln Leu Gly Lys Thr Met Val Phe Leu Lys Gln 775 780 785 acg ggc gcg aaa gaa ctg acg cag att cag aga gaa tgc ctt tct tct 3598 Thr Gly Ala Lys Glu Leu Thr Gln Ile Gln Arg Glu Cys Leu 
Ser Ser 790 795 800 tgg gag cct ctc gtc tca gtg ctc gag gcg tac tac gct ggc aga cgc 3646 Trp Glu Pro Leu Val Ser Val Leu Glu Ala Tyr Tyr Ala Gly Arg Arg 805 810 815 cac aag aag cag ctg ctg aaa aag acc ccc ttc atc att cgc gcc cag 3694 His Lys Lys Gln Leu Leu Lys Lys Thr Pro Phe Ile Ile Arg Ala Gln 820 825 830 gct cac atc cgc aga cac ctg gtg gac aac aac gtc agc ccc gcg act 3742 Ala His Ile Arg Arg His Leu Val Asp Asn Asn Val Ser Pro Ala Thr 835 840 845 850 gtt cag ccg gcg ttc gga tcc act cga gat gca ggc ggt gct taa 3787 Val Gln Pro Ala Phe Gly Ser Thr Arg Asp Ala Gly Gly Ala 855 860 ttaatcaccg ttgtgctcac ttctcaaatc gacaaaggaa acacacttcg tgcagcatgt 3847 gccccattat aaagaaactg agttgttccg ctgtggcttg caggtgtcac atccacaaaa 3907 accggccgac tctaaatagg agtgtttcgc agcaagcagc gaaagtttat gactgggtcc 3967 gaatctctga acggatgtgt ggcggacctg gctgatgttg atcgccgtcg acacacgcgc 4027 cacatgggtc aatacacaag acagctatca gttgttttag tcgaaccggt taacacaatt 4087 cttgcccccc cgagggggat ccactagttc tagagcggcc gccaccgcgg tggagctcca 4147 gcttttgttc cctttagtga gggttaattg cgcgcttggc gtaatcatgg tcatagctgt 4207 ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa catacgagcc ggaagcataa 4267 agtgtaaagc ctggggtgcc taatgagtga gctaactcac attaattgcg ttgcgctcac 4327 tgcccgcttt ccagtcggga aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg 4387 cggggagagg cggtttgcgt attgggcgct cttccgcttc ctcgctcact gactcgctgc 4447 gctcggtcgt tcggctgcgg cgagcggtat cagctcactc aaaggcggta atacggttat 4507 ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag caaaaggcca 4567 ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc cctgacgagc 4627 atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta taaagatacc 4687 aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg 4747 gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcatagc tcacgctgta 4807 ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg 4867 ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac ccggtaagac 4927 acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg 
aggtatgtag 4987 gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga aggacagtat 5047 ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt agctcttgat 5107 ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc 5167 gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct gacgctcagt 5227 ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt atcaaaaagg atcttcacct 5287 agatcctttt aaattaaaaa tgaagtttta aatcaatcta aagtatatat gagtaaactt 5347 ggtctgacag ttaccaatgc ttaatcagtg aggcacctat ctcagcgatc tgtctatttc 5407 gttcatccat agttgcctga ctccccgtcg tgtagataac tacgatacgg gagggcttac 5467 catctggccc cagtgctgca atgataccgc gagacccacg ctcaccggct ccagatttat 5527 cagcaataaa ccagccagcc ggaagggccg agcgcagaag tggtcctgca actttatccg 5587 cctccatcca gtctattaat tgttgccggg aagctagagt aagtagttcg ccagttaata 5647 gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt gtcacgctcg tcgtttggta 5707 tggcttcatt cagctccggt tcccaacgat caaggcgagt tacatgatcc cccatgttgt 5767 gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag 5827 tgttatcact catggttatg gcagcactgc ataattctct tactgtcatg ccatccgtaa 5887 gatgcttttc tgtgactggt gagtactcaa ccaagtcatt ctgagaatag tgtatgcggc 5947 gaccgagttg ctcttgcccg gcgtcaatac gggataatac cgcgccacat agcagaactt 6007 taaaagtgct catcattgga aaacgttctt cggggcgaaa actctcaagg atcttaccgc 6067 tgttgagatc cagttcgatg taacccactc gtgcacccaa ctgatcttca gcatctttta 6127 ctttcaccag cgtttctggg tgagcaaaaa caggaaggca aaatgccgca aaaaagggaa 6187 taagggcgac acggaaatgt tgaatactca tactcttcct ttttcaatat tattgaagca 6247 tttatcaggg ttattgtctc atgagcggat acatatttga atgtatttag aaaaataaac 6307 aaataggggt tccgcgcaca tttccccgaa aagtgccac 6346 7 864 PRT Artificial Sequence Description of Artificial Sequence pTetO7Sag1-MyoA 7 Met His Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Leu His His His 1 5 10 15 His His His His Asp Gly Thr Glu Leu Ala Ser Lys Thr Thr Ser Glu 20 25 30 Glu Leu Lys Thr Ala Thr Ala Leu Lys Lys Arg Ser Ser Asp Val His 35 40 45 Ala Val Asp His Ser Gly Asn Val Tyr Lys Gly Phe Gln Ile Trp Thr 50 55 
60 Asp Leu Ala Pro Ser Val Lys Glu Glu Pro Asp Leu Met Phe Ala Lys 65 70 75 80 Cys Ile Val Gln Ala Gly Thr Asp Lys Gly Asn Leu Thr Cys Val Gln 85 90 95 Ile Asp Pro Pro Gly Phe Asp Glu Pro Phe Glu Val Pro Gln Ala Asn 100 105 110 Ala Trp Asn Val Asn Ser Leu Ile Asp Pro Met Thr Tyr Gly Asp Ile 115 120 125 Gly Met Leu Pro His Thr Asn Ile Pro Cys Val Leu Asp Phe Leu Lys 130 135 140 Val Arg Phe Met Lys Asn Gln Ile Tyr Thr Thr Ala Asp Pro Leu Val 145 150 155 160 Val Ala Ile Asn Pro Phe Arg Asp Leu Gly Asn Thr Thr Leu Asp Trp 165 170 175 Ile Val Arg Tyr Arg Asp Thr Phe Asp Leu Ser Lys Leu Ala Pro His 180 185 190 Val Phe Tyr Thr Ala Arg Arg Ala Leu Asp Asn Leu His Ala Val Asn 195 200 205 Lys Ser Gln Thr Ile Ile Val Ser Gly Glu Ser Gly Ala Gly Lys Thr 210 215 220 Glu Ala Thr Lys Gln Ile Met Arg Tyr Phe Ala Ala Ala Lys Thr Gly 225 230 235 240 Ser Met Asp Leu Arg Ile Gln Asn Ala Ile Met Ala Ala Asn Pro Val 245 250 255 Leu Glu Ala Phe Gly Asn Ala Lys Thr Ile Arg Asn Asn Asn Ser Ser 260 265 270 Arg Phe Gly Arg Phe Met Gln Leu Asp Val Gly Arg Glu Gly Gly Ile 275 280 285 Lys Phe Gly Ser Val Val Ala Phe Leu Leu Glu Lys Ser Arg Val Leu 290 295 300 Thr Gln Asp Glu Gln Glu Arg Ser Tyr His Ile Phe Tyr Gln Met Cys 305 310 315 320 Lys Gly Ala Asp Ala Ala Met Lys Glu Arg Phe His Ile Leu Pro Leu 325 330 335 Ser Glu Tyr Lys Tyr Ile Asn Pro Leu Cys Leu Asp Ala Pro Gly Ile 340 345 350 Asp Asp Val Ala Glu Phe His Glu Val Cys Glu Ser Phe Arg Ser Met 355 360 365 Asn Leu Thr Glu Asp Glu Val Ala Ser Val Trp Ser Ile Val Ser Gly 370 375 380 Val Leu Leu Leu Gly Asn Val Glu Val Thr Ala Thr Lys Asp Gly Gly 385 390 395 400 Ile Asp Asp Ala Ala Ala Ile Glu Gly Lys Asn Leu Glu Val Phe Lys 405 410 415 Lys Ala Cys Gly Leu Leu Phe Leu Asp Ala Glu Arg Ile Arg Glu Glu 420 425 430 Leu Thr Val Lys Val Ser Tyr Ala Gly Asn Gln Glu Ile Arg Gly Arg 435 440 445 Trp Lys Gln Glu Asp Gly Asp Met Leu Lys Ser Ser Leu Ala Lys Ala 450 455 460 Met Tyr Asp Lys Leu Phe Met Trp Ile Ile Ala Val Leu Asn Arg Ser 465 470 475 480 Ile 
Lys Pro Pro Gly Gly Phe Lys Ile Phe Met Gly Met Leu Asp Ile 485 490 495 Phe Gly Phe Glu Val Phe Lys Asn Asn Ser Leu Glu Gln Phe Phe Ile 500 505 510 Asn Ile Thr Asn Glu Met Leu Gln Lys Asn Phe Val Asp Ile Val Phe 515 520 525 Asp Arg Glu Ser Lys Leu Tyr Arg Asp Glu Gly Val Ser Ser Lys Glu 530 535 540 Leu Ile Phe Thr Ser Asn Ala Glu Val Ile Lys Ile Leu Thr Ala Lys 545 550 555 560 Asn Asn Ser Val Leu Ala Ala Leu Glu Asp Gln Cys Leu Ala Pro Gly 565 570 575 Gly Ser Asp Glu Lys Phe Leu Ser Thr Cys Lys Asn Ala Leu Lys Gly 580 585 590 Thr Thr Lys Phe Lys Pro Ala Lys Val Ser Pro Asn Ile Asn Phe Leu 595 600 605 Ile Ser His Thr Val Gly Asp Ile Gln Tyr Asn Ala Glu Gly Phe Leu 610 615 620 Phe Lys Asn Lys Asp Val Leu Arg Ala Glu Ile Met Glu Ile Val Gln 625 630 635 640 Gln Ser Lys Asn Pro Val Val Ala Gln Leu Phe Ala Gly Ile Val Met 645 650 655 Glu Lys Gly Lys Met Ala Lys Gly Gln Leu Ile Gly Ser Gln Phe Leu 660 665 670 Ser Gln Leu Gln Ser Leu Met Glu Leu Ile Asn Ser Thr Glu Pro His 675 680 685 Phe Ile Arg Cys Ile Lys Pro Asn Asp Thr Lys Lys Pro Leu Asp Trp 690 695 700 Val Pro Ser Lys Met Leu Ile Gln Leu His Ala Leu Ser Val Leu Glu 705 710 715 720 Ala Leu Gln Leu Arg Gln Leu Gly Tyr Ser Tyr Arg Arg Pro Phe Lys 725 730 735 Glu Phe Leu Phe Gln Phe Lys Phe Ile Asp Leu Ser Ala Ser Glu Asn 740 745 750 Pro Asn Leu Asp Pro Lys Glu Ala Ala Leu Arg Leu Leu Lys Ser Ser 755 760 765 Lys Leu Pro Ser Glu Glu Tyr Gln Leu Gly Lys Thr Met Val Phe Leu 770 775 780 Lys Gln Thr Gly Ala Lys Glu Leu Thr Gln Ile Gln Arg Glu Cys Leu 785 790 795 800 Ser Ser Trp Glu Pro Leu Val Ser Val Leu Glu Ala Tyr Tyr Ala Gly 805 810 815 Arg Arg His Lys Lys Gln Leu Leu Lys Lys Thr Pro Phe Ile Ile Arg 820 825 830 Ala Gln Ala His Ile Arg Arg His Leu Val Asp Asn Asn Val Ser Pro 835 840 845 Ala Thr Val Gln Pro Ala Phe Gly Ser Thr Arg Asp Ala Gly Gly Ala 850 855 860 8 4556 DNA Artificial Sequence CDS (1270)..(2001) Description of Artificial Sequence pTetO7Sag4-GFP 8 ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg 
cgcagcgtga 60 ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 120 ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta gggttccgat 180 ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt tcacgtagtg 240 ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaata 300 gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat tcttttgatt 360 tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat 420 ttaacgcgaa ttttaacaaa atattaacgc ttacaatttc cattcgccat tcaggctgcg 480 caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540 gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg 600 taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat tgggtaccga 660 gctcgacttt cacttttctc tatcactgat agggagtggt aaactcgact ttcacttttc 720 tctatcactg atagggagtg gtaaactcga ctttcacttt tctctatcac tgatagggag 780 tggtaaactc gactttcact tttctctatc actgataggg agtggtaaac tcgactttca 840 cttttctcta tcactgatag ggagtggtaa actcgacttt cacttttctc tatcactgat 900 agggagtggt aaactcgact ttcacttttc tctatcactg atagggagtg gtaaactcga 960 ggtcgacggt atcgataagc ttacgccgct gagactaact agaaagaagt gtgcaacagt 1020 tcatgaggga caaaaggaat gtgatgcggt tcgcttgaag aaggaatgtt taagcacgtc 1080 aacaatacgc cttggcgaat gttcatgact gttcatgtgg ttcatcggat catttgaaaa 1140 catcgtgagg ctggtacctg gtcgcaaacg tcgtagtgta gtaccgacaa taacgtcgtc 1200 gttcaagggg acgcagttct cggaagacgc gtcgcagcat actgcaactg ctttcgtctg 1260 tcttcaacc atg cat agt aaa gga gaa gaa ctt ttc act gga gtt gtc cca 1311 Met His Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro 1 5 10 att ctt gtt gaa tta gat ggt gat gtt aat ggg cac aaa ttt tct gtc 1359 Ile Leu Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val 15 20 25 30 agt gga gag ggt gaa ggt gat gca aca tac gga aaa ctt acc ctt aaa 1407 Ser Gly Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys 35 40 45 ttt att tgc act act gga aaa cta cct gtt cca tgg cca aca ctt gtc 1455 Phe Ile Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val 50 55 60 act act ttc tct 
tat ggt gtt caa tgc ttt tca aga tac cca gat cat 1503 Thr Thr Phe Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His 65 70 75 atg aag cgg cac gac ttc ttc aag agc gcc atg cct gag gga tac gtg 1551 Met Lys Arg His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val 80 85 90 cag gag agg acc atc ttc ttc aag gac gac ggg aac tac aag aca cgt 1599 Gln Glu Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg 95 100 105 110 gct gaa gtc aag ttt gag gga gac acc ctc gtc aac agg atc gag ctt 1647 Ala Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu 115 120 125 aag gga atc gat ttc aag gag gac gga aac atc ctc ggc cac aag ttg 1695 Lys Gly Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu 130 135 140 gaa tac aac tac aac tcc cac aac gta tac atc atg gcc gac aag caa 1743 Glu Tyr Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln 145 150 155 aag aac ggc atc aaa gcc aac ttc aag acc cgc cac aac atc gaa gac 1791 Lys Asn Gly Ile Lys Ala Asn Phe Lys Thr Arg His Asn Ile Glu Asp 160 165 170 ggc ggc gtg caa ctc gct gat cat tat caa caa aat act cca att ggc 1839 Gly Gly Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly 175 180 185 190 gat ggc cct gtc ctt tta cca gac aac cat tac ctg tcc aca caa tct 1887 Asp Gly Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser 195 200 205 gcc ctt tcg aaa gat ccc aac gaa aag aga gac cac atg gtc ctt ctt 1935 Ala Leu Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu 210 215 220 gag ttt gta aca gct gct ggg att aca cat ggc atg gat gaa cta tac 1983 Glu Phe Val Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr 225 230 235 aaa gct gca gtt aat taa tcaccgttgt gctcacttct caaatcgaca 2031 Lys Ala Ala Val Asn 240 aaggaaacac acttcgtgca gcatgtgccc cattataaag aaactgagtt gttccgctgt 2091 ggcttgcagg tgtcacatcc acaaaaaccg gccgactcta aataggagtg tttcgcagca 2151 agcagcgaaa gtttatgact gggtccgaat ctctgaacgg atgtgtggcg gacctggctg 2211 atgttgatcg ccgtcgacac acgcgccaca tgggtcaata cacaagacag ctatcagttg 2271 ttttagtcga accggttaac acaattcttg cccccccgag 
ggggatccac tagttctaga 2331 gcggccgcca ccgcggtgga gctccagctt ttgttccctt tagtgagggt taattgcgcg 2391 cttggcgtaa tcatggtcat agctgtttcc tgtgtgaaat tgttatccgc tcacaattcc 2451 acacaacata cgagccggaa gcataaagtg taaagcctgg ggtgcctaat gagtgagcta 2511 actcacatta attgcgttgc gctcactgcc cgctttccag tcgggaaacc tgtcgtgcca 2571 gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg ggcgctcttc 2631 cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc 2691 tcactcaaag gcggtaatac ggttatccac agaatcaggg gataacgcag gaaagaacat 2751 gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt 2811 ccataggctc cgcccccctg acgagcatca caaaaatcga cgctcaagtc agaggtggcg 2871 aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc 2931 tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt 2991 ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa 3051 gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta 3111 tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa 3171 caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa 3231 ctacggctac actagaagga cagtatttgg tatctgcgct ctgctgaagc cagttacctt 3291 cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt 3351 ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat 3411 cttttctacg gggtctgacg ctcagtggaa cgaaaactca cgttaaggga ttttggtcat 3471 gagattatca aaaaggatct tcacctagat ccttttaaat taaaaatgaa gttttaaatc 3531 aatctaaagt atatatgagt aaacttggtc tgacagttac caatgcttaa tcagtgaggc 3591 acctatctca gcgatctgtc tatttcgttc atccatagtt gcctgactcc ccgtcgtgta 3651 gataactacg atacgggagg gcttaccatc tggccccagt gctgcaatga taccgcgaga 3711 cccacgctca ccggctccag atttatcagc aataaaccag ccagccggaa gggccgagcg 3771 cagaagtggt cctgcaactt tatccgcctc catccagtct attaattgtt gccgggaagc 3831 tagagtaagt agttcgccag ttaatagttt gcgcaacgtt gttgccattg ctacaggcat 3891 cgtggtgtca cgctcgtcgt ttggtatggc ttcattcagc tccggttccc aacgatcaag 3951 gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt agctccttcg 
gtcctccgat 4011 cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg gttatggcag cactgcataa 4071 ttctcttact gtcatgccat ccgtaagatg cttttctgtg actggtgagt actcaaccaa 4131 gtcattctga gaatagtgta tgcggcgacc gagttgctct tgcccggcgt caatacggga 4191 taataccgcg ccacatagca gaactttaaa agtgctcatc attggaaaac gttcttcggg 4251 gcgaaaactc tcaaggatct taccgctgtt gagatccagt tcgatgtaac ccactcgtgc 4311 acccaactga tcttcagcat cttttacttt caccagcgtt tctgggtgag caaaaacagg 4371 aaggcaaaat gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa tactcatact 4431 cttccttttt caatattatt gaagcattta tcagggttat tgtctcatga gcggatacat 4491 atttgaatgt atttagaaaa ataaacaaat aggggttccg cgcacatttc cccgaaaagt 4551 gccac 4556 9 243 PRT Artificial Sequence Description of Artificial Sequence pTetO7Sag4-GFP 9 Met His Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu 1 5 10 15 Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly 20 25 30 Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile 35 40 45 Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr 50 55 60 Phe Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys 65 70 75 80 Arg His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu 85 90 95 Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu 100 105 110 Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly 115 120 125 Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr 130 135 140 Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn 145 150 155 160 Gly Ile Lys Ala Asn Phe Lys Thr Arg His Asn Ile Glu Asp Gly Gly 165 170 175 Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly 180 185 190 Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu 195 200 205 Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe 210 215 220 Val Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys Ala 225 230 235 240 Ala Val Asn 10 4479 DNA Artificial Sequence CDS (1193)..(1924) misc_feature (1102) n is disclosed as an asterisk 10 
ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga 60 ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 120 ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta gggttccgat 180 ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt tcacgtagtg 240 ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaata 300 gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat tcttttgatt 360 tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat 420 ttaacgcgaa ttttaacaaa atattaacgc ttacaatttc cattcgccat tcaggctgcg 480 caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540 gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg 600 taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat tgggtaccga 660 gctcgacttt cacttttctc tatcactgat agggagtggt aaactcgact ttcacttttc 720 tctatcactg atagggagtg gtaaactcga ctttcacttt tctctatcac tgatagggag 780 tggtaaactc gactttcact tttctctatc actgataggg agtggtaaac tcgactttca 840 cttttctcta tcactgatag ggagtggtaa actcgacttt cacttttctc tatcactgat 900 agggagtggt aaactcgact ttcacttttc tctatcactg atagggagtg gtaaactcga 960 ggtcgacggt atcgataagc ttcaatgtgc acctgtagga agctgtagtc actgctgatt 1020 ctcactgttc tcggcaaggg ccgacgaccg gagtacagtt tttgtgggca gagccgttgt 1080 gcagctttcc gttcttctcg gntgtgtcac atgtgtcatt gtcgtgtaaa cacacggttg 1140 gatgtcggtt tcgctgcacc acttcattat ttcttctggt tttttgacga gt atg cat 1198 Met His 1 agt aaa gga gaa gaa ctt ttc act gga gtt gtc cca att ctt gtt gaa 1246 Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu 5 10 15 tta gat ggt gat gtt aat ggg cac aaa ttt tct gtc agt gga gag ggt 1294 Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly 20 25 30 gaa ggt gat gca aca tac gga aaa ctt acc ctt aaa ttt att tgc act 1342 Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr 35 40 45 50 act gga aaa cta cct gtt cca tgg cca aca ctt gtc act act ttc tct 1390 Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe Ser 55 60 65 tat ggt gtt caa 
tgc ttt tca aga tac cca gat cat atg aag cgg cac 1438 Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Arg His 70 75 80 gac ttc ttc aag agc gcc atg cct gag gga tac gtg cag gag agg acc 1486 Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr 85 90 95 atc ttc ttc aag gac gac ggg aac tac aag aca cgt gct gaa gtc aag 1534 Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val Lys 100 105 110 ttt gag gga gac acc ctc gtc aac agg atc gag ctt aag gga atc gat 1582 Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp 115 120 125 130 ttc aag gag gac gga aac atc ctc ggc cac aag ttg gaa tac aac tac 1630 Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Tyr 135 140 145 aac tcc cac aac gta tac atc atg gcc gac aag caa aag aac ggc atc 1678 Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly Ile 150 155 160 aaa gcc aac ttc aag acc cgc cac aac atc gaa gac ggc ggc gtg caa 1726 Lys Ala Asn Phe Lys Thr Arg His Asn Ile Glu Asp Gly Gly Val Gln 165 170 175 ctc gct gat cat tat caa caa aat act cca att ggc gat ggc cct gtc 1774 Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val 180 185 190 ctt tta cca gac aac cat tac ctg tcc aca caa tct gcc ctt tcg aaa 1822 Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser Lys 195 200 205 210 gat ccc aac gaa aag aga gac cac atg gtc ctt ctt gag ttt gta aca 1870 Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val Thr 215 220 225 gct gct ggg att aca cat ggc atg gat gaa cta tac aaa gct gca gtt 1918 Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys Ala Ala Val 230 235 240 aat taa tcaccgttgt gctcacttct caaatcgaca aaggaaacac acttcgtgca 1974 Asn gcatgtgccc cattataaag aaactgagtt gttccgctgt ggcttgcagg tgtcacatcc 2034 acaaaaaccg gccgactcta aataggagtg tttcgcagca agcagcgaaa gtttatgact 2094 gggtccgaat ctctgaacgg atgtgtggcg gacctggctg atgttgatcg ccgtcgacac 2154 acgcgccaca tgggtcaata cacaagacag ctatcagttg ttttagtcga accggttaac 2214 acaattcttg cccccccgag ggggatccac tagttctaga gcggccgcca 
ccgcggtgga 2274 gctccagctt ttgttccctt tagtgagggt taattgcgcg cttggcgtaa tcatggtcat 2334 agctgtttcc tgtgtgaaat tgttatccgc tcacaattcc acacaacata cgagccggaa 2394 gcataaagtg taaagcctgg ggtgcctaat gagtgagcta actcacatta attgcgttgc 2454 gctcactgcc cgctttccag tcgggaaacc tgtcgtgcca gctgcattaa tgaatcggcc 2514 aacgcgcggg gagaggcggt ttgcgtattg ggcgctcttc cgcttcctcg ctcactgact 2574 cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag gcggtaatac 2634 ggttatccac agaatcaggg gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa 2694 aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc cgcccccctg 2754 acgagcatca caaaaatcga cgctcaagtc agaggtggcg aaacccgaca ggactataaa 2814 gataccaggc gtttccccct ggaagctccc tcgtgcgctc tcctgttccg accctgccgc 2874 ttaccggata cctgtccgcc tttctccctt cgggaagcgt ggcgctttct catagctcac 2934 gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac 2994 cccccgttca gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag tccaacccgg 3054 taagacacga cttatcgcca ctggcagcag ccactggtaa caggattagc agagcgaggt 3114 atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa ctacggctac actagaagga 3174 cagtatttgg tatctgcgct ctgctgaagc cagttacctt cggaaaaaga gttggtagct 3234 cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga 3294 ttacgcgcag aaaaaaagga tctcaagaag atcctttgat cttttctacg gggtctgacg 3354 ctcagtggaa cgaaaactca cgttaaggga ttttggtcat gagattatca aaaaggatct 3414 tcacctagat ccttttaaat taaaaatgaa gttttaaatc aatctaaagt atatatgagt 3474 aaacttggtc tgacagttac caatgcttaa tcagtgaggc acctatctca gcgatctgtc 3534 tatttcgttc atccatagtt gcctgactcc ccgtcgtgta gataactacg atacgggagg 3594 gcttaccatc tggccccagt gctgcaatga taccgcgaga cccacgctca ccggctccag 3654 atttatcagc aataaaccag ccagccggaa gggccgagcg cagaagtggt cctgcaactt 3714 tatccgcctc catccagtct attaattgtt gccgggaagc tagagtaagt agttcgccag 3774 ttaatagttt gcgcaacgtt gttgccattg ctacaggcat cgtggtgtca cgctcgtcgt 3834 ttggtatggc ttcattcagc tccggttccc aacgatcaag gcgagttaca tgatccccca 3894 tgttgtgcaa aaaagcggtt agctccttcg gtcctccgat cgttgtcaga agtaagttgg 
3954 ccgcagtgtt atcactcatg gttatggcag cactgcataa ttctcttact gtcatgccat 4014 ccgtaagatg cttttctgtg actggtgagt actcaaccaa gtcattctga gaatagtgta 4074 tgcggcgacc gagttgctct tgcccggcgt caatacggga taataccgcg ccacatagca 4134 gaactttaaa agtgctcatc attggaaaac gttcttcggg gcgaaaactc tcaaggatct 4194 taccgctgtt gagatccagt tcgatgtaac ccactcgtgc acccaactga tcttcagcat 4254 cttttacttt caccagcgtt tctgggtgag caaaaacagg aaggcaaaat gccgcaaaaa 4314 agggaataag ggcgacacgg aaatgttgaa tactcatact cttccttttt caatattatt 4374 gaagcattta tcagggttat tgtctcatga gcggatacat atttgaatgt atttagaaaa 4434 ataaacaaat aggggttccg cgcacatttc cccgaaaagt gccac 4479 11 243 PRT Artificial Sequence Description of Artificial Sequence pTetO7Sag1-GFP 11 Met His Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu 1 5 10 15 Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly 20 25 30 Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile 35 40 45 Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr 50 55 60 Phe Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys 65 70 75 80 Arg His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu 85 90 95 Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu 100 105 110 Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly 115 120 125 Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr 130 135 140 Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn 145 150 155 160 Gly Ile Lys Ala Asn Phe Lys Thr Arg His Asn Ile Glu Asp Gly Gly 165 170 175 Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly 180 185 190 Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu 195 200 205 Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe 210 215 220 Val Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys Ala 225 230 235 240 Ala Val Asn 12 4438 DNA Artificial Sequence CDS (1193)..(1885) Description of Artificial Sequence pTetO7Sag1-HXGPRT 12 ctgacgcgcc ctgtagcggc gcattaagcg 
cggcgggtgt ggtggttacg cgcagcgtga 60 ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 120 ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta gggttccgat 180 ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt tcacgtagtg 240 ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaata 300 gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat tcttttgatt 360 tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat 420 ttaacgcgaa ttttaacaaa atattaacgc ttacaatttc cattcgccat tcaggctgcg 480 caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540 gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg 600 taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat tgggtaccga 660 gctcgacttt cacttttctc tatcactgat agggagtggt aaactcgact ttcacttttc 720 tctatcactg atagggagtg gtaaactcga ctttcacttt tctctatcac tgatagggag 780 tggtaaactc gactttcact tttctctatc actgataggg agtggtaaac tcgactttca 840 cttttctcta tcactgatag ggagtggtaa actcgacttt cacttttctc tatcactgat 900 agggagtggt aaactcgact ttcacttttc tctatcactg atagggagtg gtaaactcga 960 ggtcgacggt atcgataagc ttcaatgtgc acctgtagga agctgtagtc actgctgatt 1020 ctcactgttc tcggcaaggg ccgacgaccg gagtacagtt tttgtgggca gagccgttgt 1080 gcagctttcc gttcttctcg gttgtgtcac atgtgtcatt gtcgtgtaaa cacacggttg 1140 tatgtcggtt tcgctgcacc acttcattat ttcttctggt tttttgacga gt atg cat 1198 Met His 1 gcg tcc aaa ccc att gaa gac tac ggc aag ggc aag ggc cgt att gag 1246 Ala Ser Lys Pro Ile Glu Asp Tyr Gly Lys Gly Lys Gly Arg Ile Glu 5 10 15 ccc atg tat atc ccc gac aac acc ttc tac aac gct gat gac ttt ctt 1294 Pro Met Tyr Ile Pro Asp Asn Thr Phe Tyr Asn Ala Asp Asp Phe Leu 20 25 30 gtg ccc ccc cac tgc aag ccc tac att gac aaa atc ctc ctc cct ggt 1342 Val Pro Pro His Cys Lys Pro Tyr Ile Asp Lys Ile Leu Leu Pro Gly 35 40 45 50 gga ttg gtc aag gac aga gtt gag aag ttg gcg tat gac atc cac aga 1390 Gly Leu Val Lys Asp Arg Val Glu Lys Leu Ala Tyr Asp Ile His Arg 55 60 65 act tac ttc ggc gag gag ttg cac atc att tgc atc 
ctg aaa ggc tct 1438 Thr Tyr Phe Gly Glu Glu Leu His Ile Ile Cys Ile Leu Lys Gly Ser 70 75 80 cgc ggc ttc ttc aac ctt ctg atc gac tac ctt gcc acc ata cag aat 1486 Arg Gly Phe Phe Asn Leu Leu Ile Asp Tyr Leu Ala Thr Ile Gln Asn 85 90 95 ggt cgt gag tcc agc gtg ccc ccc ttc ttc gag cac tat gtc cgc ctg 1534 Gly Arg Glu Ser Ser Val Pro Pro Phe Phe Glu His Tyr Val Arg Leu 100 105 110 aag tcc tac cag aac gac aac agc aca ggc cag ctc acc gtc ttg agc 1582 Lys Ser Tyr Gln Asn Asp Asn Ser Thr Gly Gln Leu Thr Val Leu Ser 115 120 125 130 gac gac ttg tca atc ttt cgc gac aag cac gtt ctg att gtt gag gac 1630 Asp Asp Leu Ser Ile Phe Arg Asp Lys His Val Leu Ile Val Glu Asp 135 140 145 atc gtc gac acc ggt ttc acc ctc acc gag ttc ggt gag cgc ctg aaa 1678 Ile Val Asp Thr Gly Phe Thr Leu Thr Glu Phe Gly Glu Arg Leu Lys 150 155 160 gcc gtc ggt ccc aag tcg atg aga atc gcc acc ctc gtc gag aag cgc 1726 Ala Val Gly Pro Lys Ser Met Arg Ile Ala Thr Leu Val Glu Lys Arg 165 170 175 aca gat cgc tcc aac agc ttg aag ggc gac ttc gtc ggc ttc agc att 1774 Thr Asp Arg Ser Asn Ser Leu Lys Gly Asp Phe Val Gly Phe Ser Ile 180 185 190 gaa gac gtc tgg atc gtt ggt tgc tgc tac gac ttc aac gag atg ttc 1822 Glu Asp Val Trp Ile Val Gly Cys Cys Tyr Asp Phe Asn Glu Met Phe 195 200 205 210 cgc gac ttc gac cac gtc gcc gtc ctg agc gac gcc gct cgc aaa aag 1870 Arg Asp Phe Asp His Val Ala Val Leu Ser Asp Ala Ala Arg Lys Lys 215 220 225 ttc gag aag gct taa ttaatcaccg ttgtgctcac ttctcaaatc gacaaaggaa 1925 Phe Glu Lys Ala 230 acacacttcg tgcagcatgt gccccattat aaagaaactg agttgttccg ctgtggcttg 1985 caggtgtcac atccacaaaa accggccgac tctaaatagg agtgtttcgc agcaagcagc 2045 gaaagtttat gactgggtcc gaatctctga acggatgtgt ggcggacctg gctgatgttg 2105 atcgccgtcg acacacgcgc cacatgggtc aatacacaag acagctatca gttgttttag 2165 tcgaaccggt taacacaatt cttgcccccc cgagggggat ccactagagc ggccgccacc 2225 gcggtggagc tccagctttt gttcccttta gtgagggtta attgcgcgct tggcgtaatc 2285 atggtcatag ctgtttcctg tgtgaaattg ttatccgctc acaattccac acaacatacg 2345 
agccggaagc ataaagtgta aagcctgggg tgcctaatga gtgagctaac tcacattaat 2405 tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg tcgtgccagc tgcattaatg 2465 aatcggccaa cgcgcgggga gaggcggttt gcgtattggg cgctcttccg cttcctcgct 2525 cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg gtatcagctc actcaaaggc 2585 ggtaatacgg ttatccacag aatcagggga taacgcagga aagaacatgt gagcaaaagg 2645 ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg gcgtttttcc ataggctccg 2705 cccccctgac gagcatcaca aaaatcgacg ctcaagtcag aggtggcgaa acccgacagg 2765 actataaaga taccaggcgt ttccccctgg aagctccctc gtgcgctctc ctgttccgac 2825 cctgccgctt accggatacc tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca 2885 tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt 2945 gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc 3005 caacccggta agacacgact tatcgccact ggcagcagcc actggtaaca ggattagcag 3065 agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg tggcctaact acggctacac 3125 tagaaggaca gtatttggta tctgcgctct gctgaagcca gttaccttcg gaaaaagagt 3185 tggtagctct tgatccggca aacaaaccac cgctggtagc ggtggttttt ttgtttgcaa 3245 gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat cctttgatct tttctacggg 3305 gtctgacgct cagtggaacg aaaactcacg ttaagggatt ttggtcatga gattatcaaa 3365 aaggatcttc acctagatcc ttttaaatta aaaatgaagt tttaaatcaa tctaaagtat 3425 atatgagtaa acttggtctg acagttacca atgcttaatc agtgaggcac ctatctcagc 3485 gatctgtcta tttcgttcat ccatagttgc ctgactcccc gtcgtgtaga taactacgat 3545 acgggagggc ttaccatctg gccccagtgc tgcaatgata ccgcgagacc cacgctcacc 3605 ggctccagat ttatcagcaa taaaccagcc agccggaagg gccgagcgca gaagtggtcc 3665 tgcaacttta tccgcctcca tccagtctat taattgttgc cgggaagcta gagtaagtag 3725 ttcgccagtt aatagtttgc gcaacgttgt tgccattgct acaggcatcg tggtgtcacg 3785 ctcgtcgttt ggtatggctt cattcagctc cggttcccaa cgatcaaggc gagttacatg 3845 atcccccatg ttgtgcaaaa aagcggttag ctccttcggt cctccgatcg ttgtcagaag 3905 taagttggcc gcagtgttat cactcatggt tatggcagca ctgcataatt ctcttactgt 3965 catgccatcc gtaagatgct tttctgtgac tggtgagtac tcaaccaagt cattctgaga 4025 atagtgtatg 
cggcgaccga gttgctcttg cccggcgtca atacgggata ataccgcgcc 4085 acatagcaga actttaaaag tgctcatcat tggaaaacgt tcttcggggc gaaaactctc 4145 aaggatctta ccgctgttga gatccagttc gatgtaaccc actcgtgcac ccaactgatc 4205 ttcagcatct tttactttca ccagcgtttc tgggtgagca aaaacaggaa ggcaaaatgc 4265 cgcaaaaaag ggaataaggg cgacacggaa atgttgaata ctcatactct tcctttttca 4325 atattattga agcatttatc agggttattg tctcatgagc ggatacatat ttgaatgtat 4385 ttagaaaaat aaacaaatag gggttccgcg cacatttccc cgaaaagtgc cac 4438 13 230 PRT Artificial Sequence Description of Artificial Sequence pTetO7Sag1-HXGPRT 13 Met His Ala Ser Lys Pro Ile Glu Asp Tyr Gly Lys Gly Lys Gly Arg 1 5 10 15 Ile Glu Pro Met Tyr Ile Pro Asp Asn Thr Phe Tyr Asn Ala Asp Asp 20 25 30 Phe Leu Val Pro Pro His Cys Lys Pro Tyr Ile Asp Lys Ile Leu Leu 35 40 45 Pro Gly Gly Leu Val Lys Asp Arg Val Glu Lys Leu Ala Tyr Asp Ile 50 55 60 His Arg Thr Tyr Phe Gly Glu Glu Leu His Ile Ile Cys Ile Leu Lys 65 70 75 80 Gly Ser Arg Gly Phe Phe Asn Leu Leu Ile Asp Tyr Leu Ala Thr Ile 85 90 95 Gln Asn Gly Arg Glu Ser Ser Val Pro Pro Phe Phe Glu His Tyr Val 100 105 110 Arg Leu Lys Ser Tyr Gln Asn Asp Asn Ser Thr Gly Gln Leu Thr Val 115 120 125 Leu Ser Asp Asp Leu Ser Ile Phe Arg Asp Lys His Val Leu Ile Val 130 135 140 Glu Asp Ile Val Asp Thr Gly Phe Thr Leu Thr Glu Phe Gly Glu Arg 145 150 155 160 Leu Lys Ala Val Gly Pro Lys Ser Met Arg Ile Ala Thr Leu Val Glu 165 170 175 Lys Arg Thr Asp Arg Ser Asn Ser Leu Lys Gly Asp Phe Val Gly Phe 180 185 190 Ser Ile Glu Asp Val Trp Ile Val Gly Cys Cys Tyr Asp Phe Asn Glu 195 200 205 Met Phe Arg Asp Phe Asp His Val Ala Val Leu Ser Asp Ala Ala Arg 210 215 220 Lys Lys Phe Glu Lys Ala 225 230 14 8287 DNA Artificial Sequence CDS (1193)..(4318) Description of Artificial Sequence pTetO7Sag1-LacZ 14 ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga 60 ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 120 ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta gggttccgat 180 ttagtgcttt acggcacctc gaccccaaaa 
aacttgatta gggtgatggt tcacgtagtg 240 ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaata 300 gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat tcttttgatt 360 tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat 420 ttaacgcgaa ttttaacaaa atattaacgc ttacaatttc cattcgccat tcaggctgcg 480 caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540 gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg 600 taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat tgggtaccga 660 gctcgacttt cacttttctc tatcactgat agggagtggt aaactcgact ttcacttttc 720 tctatcactg atagggagtg gtaaactcga ctttcacttt tctctatcac tgatagggag 780 tggtaaactc gactttcact tttctctatc actgataggg agtggtaaac tcgactttca 840 cttttctcta tcactgatag ggagtggtaa actcgacttt cacttttctc tatcactgat 900 agggagtggt aaactcgact ttcacttttc tctatcactg atagggagtg gtaaactcga 960 ggtcgacggt atcgataagc ttcaatgtgc acctgtagga agctgtagtc actgctgatt 1020 ctcactgttc tcggcaaggg ccgacgaccg gagtacagtt tttgtgggca gagccgttgt 1080 gcagctttcc gttcttctcg gttgtgtcac atgtgtcatt gtcgtgtaaa cacacggttg 1140 tatgtcggtt tcgctgcacc acttcattat ttcttctggt tttttgacga gt atg cat 1198 Met His 1 gcc atg gag aag tta tta ttc cga agt tcc tat tct cta gaa agt ata 1246 Ala Met Glu Lys Leu Leu Phe Arg Ser Ser Tyr Ser Leu Glu Ser Ile 5 10 15 gga act tca agc ttg gca ctg gcc gtc gtt tta caa cgt cgt gac tgg 1294 Gly Thr Ser Ser Leu Ala Leu Ala Val Val Leu Gln Arg Arg Asp Trp 20 25 30 gaa aac cct ggc gtt acc caa ctt aat cgc ctt gca gca cat ccc cct 1342 Glu Asn Pro Gly Val Thr Gln Leu Asn Arg Leu Ala Ala His Pro Pro 35 40 45 50 ttc gcc agc tgg cgt aat agc gaa gag gcc cgc acc gat cgc cct tcc 1390 Phe Ala Ser Trp Arg Asn Ser Glu Glu Ala Arg Thr Asp Arg Pro Ser 55 60 65 caa cag ttg cgc agc ctg aat ggc gaa tgg cgc ttt gcc tgg ttt ccg 1438 Gln Gln Leu Arg Ser Leu Asn Gly Glu Trp Arg Phe Ala Trp Phe Pro 70 75 80 gca cca gaa gcg gtg ccg gaa agc tgg ctg gag tgc gat ctt cct gag 1486 Ala Pro Glu Ala Val Pro Glu Ser Trp Leu Glu Cys 
Asp Leu Pro Glu 85 90 95 gcc gat act gtc gtc gtc ccc tca aac tgg cag atg cac ggt tac gat 1534 Ala Asp Thr Val Val Val Pro Ser Asn Trp Gln Met His Gly Tyr Asp 100 105 110 gcg ccc atc tac acc aac gta acc tat ccc att acg gtc aat ccg ccg 1582 Ala Pro Ile Tyr Thr Asn Val Thr Tyr Pro Ile Thr Val Asn Pro Pro 115 120 125 130 ttt gtt ccc acg gag aat ccg acg ggt tgt tac tcg ctc aca ttt aat 1630 Phe Val Pro Thr Glu Asn Pro Thr Gly Cys Tyr Ser Leu Thr Phe Asn 135 140 145 gtt gat gaa agc tgg cta cag gaa ggc cag acg cga att att ttt gat 1678 Val Asp Glu Ser Trp Leu Gln Glu Gly Gln Thr Arg Ile Ile Phe Asp 150 155 160 ggc gtt aac tcg gcg ttt cat ctg tgg tgc aac ggg cgc tgg gtc ggt 1726 Gly Val Asn Ser Ala Phe His Leu Trp Cys Asn Gly Arg Trp Val Gly 165 170 175 tac ggc cag gac agt cgt ttg ccg tct gaa ttt gac ctg agc gca ttt 1774 Tyr Gly Gln Asp Ser Arg Leu Pro Ser Glu Phe Asp Leu Ser Ala Phe 180 185 190 tta cgc gcc gga gaa aac cgc ctc gcg gtg atg gtg ctg cgt tgg agt 1822 Leu Arg Ala Gly Glu Asn Arg Leu Ala Val Met Val Leu Arg Trp Ser 195 200 205 210 gac ggc agt tat ctg gaa gat cag gat atg tgg cgg atg agc ggc att 1870 Asp Gly Ser Tyr Leu Glu Asp Gln Asp Met Trp Arg Met Ser Gly Ile 215 220 225 ttc cgt gac gtc tcg ttg ctg cat aaa ccg act aca caa atc agc gat 1918 Phe Arg Asp Val Ser Leu Leu His Lys Pro Thr Thr Gln Ile Ser Asp 230 235 240 ttc cat gtt gcc act cgc ttt aat gat gat ttc agc cgc gct gta ctg 1966 Phe His Val Ala Thr Arg Phe Asn Asp Asp Phe Ser Arg Ala Val Leu 245 250 255 gag gct gaa gtt cag atg tgc ggc gag ttg cgt gac tac cta cgg gta 2014 Glu Ala Glu Val Gln Met Cys Gly Glu Leu Arg Asp Tyr Leu Arg Val 260 265 270 aca gtt tct tta tgg cag ggt gaa acg cag gtc gcc agc ggc acc gcg 2062 Thr Val Ser Leu Trp Gln Gly Glu Thr Gln Val Ala Ser Gly Thr Ala 275 280 285 290 cct ttc ggc ggt gaa att atc gat gag cgt ggt ggt tat gcc gat cgc 2110 Pro Phe Gly Gly Glu Ile Ile Asp Glu Arg Gly Gly Tyr Ala Asp Arg 295 300 305 gtc aca cta cgt ctg aac gtc gaa aac ccg aaa ctg tgg agc gcc gaa 2158 Val Thr 
Leu Arg Leu Asn Val Glu Asn Pro Lys Leu Trp Ser Ala Glu 310 315 320 atc ccg aat ctc tat cgt gcg gtg gtt gaa ctg cac acc gcc gac ggc 2206 Ile Pro Asn Leu Tyr Arg Ala Val Val Glu Leu His Thr Ala Asp Gly 325 330 335 acg ctg att gaa gca gaa gcc tgc gat gtc ggt ttc cgc gag gtg cgg 2254 Thr Leu Ile Glu Ala Glu Ala Cys Asp Val Gly Phe Arg Glu Val Arg 340 345 350 att gaa aat ggt ctg ctg ctg ctg aac ggc aag ccg ttg ctg att cga 2302 Ile Glu Asn Gly Leu Leu Leu Leu Asn Gly Lys Pro Leu Leu Ile Arg 355 360 365 370 ggc gtt aac cgt cac gag cat cat cct ctg cat ggt cag gtc atg gat 2350 Gly Val Asn Arg His Glu His His Pro Leu His Gly Gln Val Met Asp 375 380 385 gag cag acg atg gtg cag gat atc ctg ctg atg aag cag aac aac ttt 2398 Glu Gln Thr Met Val Gln Asp Ile Leu Leu Met Lys Gln Asn Asn Phe 390 395 400 aac gcc gtg cgc tgt tcg cat tat ccg aac cat ccg ctg tgg tac acg 2446 Asn Ala Val Arg Cys Ser His Tyr Pro Asn His Pro Leu Trp Tyr Thr 405 410 415 ctg tgc gac cgc tac ggc ctg tat gtg gtg gat gaa gcc aat att gaa 2494 Leu Cys Asp Arg Tyr Gly Leu Tyr Val Val Asp Glu Ala Asn Ile Glu 420 425 430 acc cac ggc atg gtg cca atg aat cgt ctg acc gat gat ccg cgc tgg 2542 Thr His Gly Met Val Pro Met Asn Arg Leu Thr Asp Asp Pro Arg Trp 435 440 445 450 cta ccg gcg atg agc gaa cgc gta acg cga atg gtg cag cgc gat cgt 2590 Leu Pro Ala Met Ser Glu Arg Val Thr Arg Met Val Gln Arg Asp Arg 455 460 465 aat cac ccg agt gtg atc atc tgg tcg ctg ggg aat gaa tca ggc cac 2638 Asn His Pro Ser Val Ile Ile Trp Ser Leu Gly Asn Glu Ser Gly His 470 475 480 ggc gct aat cac gac gcg ctg tat cgc tgg atc aaa tct gtc gat cct 2686 Gly Ala Asn His Asp Ala Leu Tyr Arg Trp Ile Lys Ser Val Asp Pro 485 490 495 tcc cgc ccg gtg cag tat gaa ggc ggc gga gcc gac acc acg gcc acc 2734 Ser Arg Pro Val Gln Tyr Glu Gly Gly Gly Ala Asp Thr Thr Ala Thr 500 505 510 gat att att tgc ccg atg tac gcg cgc gtg gat gaa gac cag ccc ttc 2782 Asp Ile Ile Cys Pro Met Tyr Ala Arg Val Asp Glu Asp Gln Pro Phe 515 520 525 530 ccg gct gtg ccg aaa tgg tcc atc 
aaa aaa tgg ctt tcg cta cct gga 2830 Pro Ala Val Pro Lys Trp Ser Ile Lys Lys Trp Leu Ser Leu Pro Gly 535 540 545 gag acg cgc ccg ctg atc ctt tgc gaa tac gcc cac gcg atg ggt aac 2878 Glu Thr Arg Pro Leu Ile Leu Cys Glu Tyr Ala His Ala Met Gly Asn 550 555 560 agt ctt ggc ggt ttc gct aaa tac tgg cag gcg ttt cgt cag tat ccc 2926 Ser Leu Gly Gly Phe Ala Lys Tyr Trp Gln Ala Phe Arg Gln Tyr Pro 565 570 575 cgt tta cag ggc ggc ttc gtc tgg gac tgg gtg gat cag tcg ctg att 2974 Arg Leu Gln Gly Gly Phe Val Trp Asp Trp Val Asp Gln Ser Leu Ile 580 585 590 aaa tat gat gaa aac ggc aac ccg tgg tcg gct tac ggc ggt gat ttt 3022 Lys Tyr Asp Glu Asn Gly Asn Pro Trp Ser Ala Tyr Gly Gly Asp Phe 595 600 605 610 ggc gat acg ccg aac gat cgc cag ttc tgt atg aac ggt ctg gtc ttt 3070 Gly Asp Thr Pro Asn Asp Arg Gln Phe Cys Met Asn Gly Leu Val Phe 615 620 625 gcc gac cgc acg ccg cat cca gcg ctg acg gaa gca aaa cac cag cag 3118 Ala Asp Arg Thr Pro His Pro Ala Leu Thr Glu Ala Lys His Gln Gln 630 635 640 cag ttt ttc cag ttc cgt tta tcc ggg caa acc atc gaa gtg acc agc 3166 Gln Phe Phe Gln Phe Arg Leu Ser Gly Gln Thr Ile Glu Val Thr Ser 645 650 655 gaa tac ctg ttc cgt cat agc gat aac gag ctc ctg cac tgg atg gtg 3214 Glu Tyr Leu Phe Arg His Ser Asp Asn Glu Leu Leu His Trp Met Val 660 665 670 gcg ctg gat ggt aag ccg ctg gca agc ggt gaa gtg cct ctg gat gtc 3262 Ala Leu Asp Gly Lys Pro Leu Ala Ser Gly Glu Val Pro Leu Asp Val 675 680 685 690 gct cca caa ggt aaa cag ttg att gaa ctg cct gaa cta ccg cag ccg 3310 Ala Pro Gln Gly Lys Gln Leu Ile Glu Leu Pro Glu Leu Pro Gln Pro 695 700 705 gag agc gcc ggg caa ctc tgg ctc aca gta cgc gta gtg caa ccg aac 3358 Glu Ser Ala Gly Gln Leu Trp Leu Thr Val Arg Val Val Gln Pro Asn 710 715 720 gcg acc gca tgg tca gaa gcc ggg cac atc agc gcc tgg cag cag tgg 3406 Ala Thr Ala Trp Ser Glu Ala Gly His Ile Ser Ala Trp Gln Gln Trp 725 730 735 cgt ctg gcg gaa aac ctc agt gtg acg ctc ccc gcc gcg tcc cac gcc 3454 Arg Leu Ala Glu Asn Leu Ser Val Thr Leu Pro Ala Ala Ser His Ala 740 
745 750 atc ccg cat ctg acc acc agc gaa atg gat ttt tgc atc gag ctg ggt 3502 Ile Pro His Leu Thr Thr Ser Glu Met Asp Phe Cys Ile Glu Leu Gly 755 760 765 770 aat aag cgt tgg caa ttt aac cgc cag tca ggc ttt ctt tca cag atg 3550 Asn Lys Arg Trp Gln Phe Asn Arg Gln Ser Gly Phe Leu Ser Gln Met 775 780 785 tgg att ggc gat aaa aaa caa ctg ctg acg ccg ctg cgc gat cag ttc 3598 Trp Ile Gly Asp Lys Lys Gln Leu Leu Thr Pro Leu Arg Asp Gln Phe 790 795 800 acc cgt gca ccg ctg gat aac gac att ggc gta agt gaa gcg acc cgc 3646 Thr Arg Ala Pro Leu Asp Asn Asp Ile Gly Val Ser Glu Ala Thr Arg 805 810 815 att gac cct aac gcc tgg gtc gaa cgc tgg aag gcg gcg ggc cat tac 3694 Ile Asp Pro Asn Ala Trp Val Glu Arg Trp Lys Ala Ala Gly His Tyr 820 825 830 cag gcc gaa gca gcg ttg ttg cag tgc acg gca gat aca ctt gct gat 3742 Gln Ala Glu Ala Ala Leu Leu Gln Cys Thr Ala Asp Thr Leu Ala Asp 835 840 845 850 gcg gtg ctg att acg acc gct cac gcg tgg cag cat cag ggg aaa acc 3790 Ala Val Leu Ile Thr Thr Ala His Ala Trp Gln His Gln Gly Lys Thr 855 860 865 tta ttt atc agc cgg aaa acc tac cgg att gat ggt agt ggt caa atg 3838 Leu Phe Ile Ser Arg Lys Thr Tyr Arg Ile Asp Gly Ser Gly Gln Met 870 875 880 gcg att acc gtt gat gtt gaa gtg gcg agc gat aca ccg cat ccg gcg 3886 Ala Ile Thr Val Asp Val Glu Val Ala Ser Asp Thr Pro His Pro Ala 885 890 895 cgg att ggc ctg aac tgc cag ctg gcg cag gta gca gag cgg gta aac 3934 Arg Ile Gly Leu Asn Cys Gln Leu Ala Gln Val Ala Glu Arg Val Asn 900 905 910 tgg ctc gga tta ggg ccg caa gaa aac tat ccc gac cgc ctt act gcc 3982 Trp Leu Gly Leu Gly Pro Gln Glu Asn Tyr Pro Asp Arg Leu Thr Ala 915 920 925 930 gcc tgt ttt gac cgc tgg gat ctg cca ttg tca gac atg tat acc ccg 4030 Ala Cys Phe Asp Arg Trp Asp Leu Pro Leu Ser Asp Met Tyr Thr Pro 935 940 945 tac gtc ttc ccg agc gaa aac ggt ctg cgc tgc ggg acg cgc gaa ttg 4078 Tyr Val Phe Pro Ser Glu Asn Gly Leu Arg Cys Gly Thr Arg Glu Leu 950 955 960 aat tat ggc cca cac cag tgg cgc ggc gac ttc cag ttc aac atc agc 4126 Asn Tyr Gly Pro His Gln 
Trp Arg Gly Asp Phe Gln Phe Asn Ile Ser 965 970 975 cgc tac agt caa cag caa ctg atg gaa acc agc cat cgc cat ctg ctg 4174 Arg Tyr Ser Gln Gln Gln Leu Met Glu Thr Ser His Arg His Leu Leu 980 985 990 cac gcg gaa gaa ggc aca tgg ctg aat atc gac ggt ttc cat atg ggg 4222 His Ala Glu Glu Gly Thr Trp Leu Asn Ile Asp Gly Phe His Met Gly 995 1000 1005 1010 att ggt ggc gac gac tcc tgg agc ccg tca gta tcg gcg gaa ttc cag 4270 Ile Gly Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu Phe Gln 1015 1020 1025 ctg agc gcc ggt cgc tac cat tac cag ttg gtc tgg tgt caa aaa taa 4318 Leu Ser Ala Gly Arg Tyr His Tyr Gln Leu Val Trp Cys Gln Lys 1030 1035 1040 taattaatta atcaccgttg tgctcacttc tcaaatcgac aaaggaaaca cacttcgtgc 4378 agcatgtgcc ccattataaa gaaactgagt tgttccgctg tggcttgcag gtgtcacatc 4438 cacaaaaacc ggccgactct aaataggagt gtttcgcagc aagcagcgaa agtttatgac 4498 tgggtccgaa tctctgaacg gatgtgtggc ggacctggct gatgttgatc gccgtcgacg 4558 gtatcgataa gcttgatatg catgtccgcg ttcgtgaaat tctctgatca agcggagtga 4618 tcaccaatca tcgtctcagc gggatgacgt tgcggcaagg gccggctcgc ggtgggcagt 4678 cagatgccga acgtaactca ggacggcttg cgctcatcgc agaacagggg tggtgcctgc 4738 attgggtgcg gttggtgatc ctggttggac cggtggagat gcgcgcgcac gaaggggatg 4798 tgtcagaaac attttgtttg ttctctgtga acttttagat gtgttaaagg cggcgaatat 4858 tagcagagag tcctccttgt tccattctct cttgaatttc gccctttcct tctctttgcg 4918 agtgtggtag agaacaagca ctcgttcgcc gtccctgacg acgcaacccg cgcagaagac 4978 atccaccaaa cggtgttaca caatcacctt gtgtgaagtt cttgcggaaa actactcgtt 5038 ggcatttttt cttgaattcc ctttttcgac aaaatgcatg agaaaaaaat cactggatat 5098 accaccgttg atatatccca atcgcatcgt aaagaacatt ttgaggcatt tcagtcagtt 5158 gctcaatgta cctataacca gaccgttcag ctggatatta cggccttttt aaagaccgta 5218 aagaaaaata agcacaagtt ttatccggcc tttattcaca ttcttgcccg cctgatgaat 5278 gctcatccgg aattccgtat ggcaatgaaa gacggtgagc tggtgatatg ggatagtgtt 5338 cacccttgtt acaccgtttt ccatgagcaa actgaaacgt tttcatcgct ctggagtgaa 5398 taccacgacg atttccggca gtttctacac atatattcgc aagatgtggc gtgttacggt 5458 
gaaaacctgg cctatttccc taaagggttt attgagaata tgtttttcgt ctcagccaat 5518 ccctgggtga gtttcaccag ttttgattta aacgtggcca atatggacaa cttcttcgcc 5578 cccgttttca ccatgggcaa atattatacg caaggcgaca aggtgctgat gccgctggcg 5638 attcaggttc atcatgccgt ctgtgatggc ttccatgtcg gcagaatgct taatgaatta 5698 caacagtact gcgatgagtg gcagggcggg gcttaattaa tcaccgttgt gctcacttct 5758 caaatcgaca aaggaaacac acttcgtgca gcatgtgccc cattataaag aaactgagtt 5818 gttccgctgt ggcttgcagg tgtcacatcc acaaaaaccg gccgactcta aataggagtg 5878 tttcgcagca agcagcgaaa gtttatgact gggtccgaat ctctgaacgg atgtgtggcg 5938 gacctggctg atgttgatcg ccgtcgacac acgcgccaca tgggtcaata cacaagacag 5998 ctatcagttg ttttagtcga accggttaac acaattcttg cccccccgag ggggatccac 6058 tagagcggcc gccaccgcgg tggagctcca gcttttgttc cctttagtga gggttaattg 6118 cgcgcttggc gtaatcatgg tcatagctgt ttcctgtgtg aaattgttat ccgctcacaa 6178 ttccacacaa catacgagcc ggaagcataa agtgtaaagc ctggggtgcc taatgagtga 6238 gctaactcac attaattgcg ttgcgctcac tgcccgcttt ccagtcggga aacctgtcgt 6298 gccagctgca ttaatgaatc ggccaacgcg cggggagagg cggtttgcgt attgggcgct 6358 cttccgcttc ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat 6418 cagctcactc aaaggcggta atacggttat ccacagaatc aggggataac gcaggaaaga 6478 acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt 6538 ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca agtcagaggt 6598 ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc 6658 gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa 6718 gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct 6778 ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta 6838 actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg 6898 gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc 6958 ctaactacgg ctacactaga aggacagtat ttggtatctg cgctctgctg aagccagtta 7018 ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct ggtagcggtg 7078 gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa gaagatcctt 7138 tgatcttttc 
tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg 7198 tcatgagatt atcaaaaagg atcttcacct agatcctttt aaattaaaaa tgaagtttta 7258 aatcaatcta aagtatatat gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg 7318 aggcacctat ctcagcgatc tgtctatttc gttcatccat agttgcctga ctccccgtcg 7378 tgtagataac tacgatacgg gagggcttac catctggccc cagtgctgca atgataccgc 7438 gagacccacg ctcaccggct ccagatttat cagcaataaa ccagccagcc ggaagggccg 7498 agcgcagaag tggtcctgca actttatccg cctccatcca gtctattaat tgttgccggg 7558 aagctagagt aagtagttcg ccagttaata gtttgcgcaa cgttgttgcc attgctacag 7618 gcatcgtggt gtcacgctcg tcgtttggta tggcttcatt cagctccggt tcccaacgat 7678 caaggcgagt tacatgatcc cccatgttgt gcaaaaaagc ggttagctcc ttcggtcctc 7738 cgatcgttgt cagaagtaag ttggccgcag tgttatcact catggttatg gcagcactgc 7798 ataattctct tactgtcatg ccatccgtaa gatgcttttc tgtgactggt gagtactcaa 7858 ccaagtcatt ctgagaatag tgtatgcggc gaccgagttg ctcttgcccg gcgtcaatac 7918 gggataatac cgcgccacat agcagaactt taaaagtgct catcattgga aaacgttctt 7978 cggggcgaaa actctcaagg atcttaccgc tgttgagatc cagttcgatg taacccactc 8038 gtgcacccaa ctgatcttca gcatctttta ctttcaccag cgtttctggg tgagcaaaaa 8098 caggaaggca aaatgccgca aaaaagggaa taagggcgac acggaaatgt tgaatactca 8158 tactcttcct ttttcaatat tattgaagca tttatcaggg ttattgtctc atgagcggat 8218 acatatttga atgtatttag aaaaataaac aaataggggt tccgcgcaca tttccccgaa 8278 aagtgccac 8287 15 1041 PRT Artificial Sequence Description of Artificial Sequence pTetO7Sag1-LacZ 15 Met His Ala Met Glu Lys Leu Leu Phe Arg Ser Ser Tyr Ser Leu Glu 1 5 10 15 Ser Ile Gly Thr Ser Ser Leu Ala Leu Ala Val Val Leu Gln Arg Arg 20 25 30 Asp Trp Glu Asn Pro Gly Val Thr Gln Leu Asn Arg Leu Ala Ala His 35 40 45 Pro Pro Phe Ala Ser Trp Arg Asn Ser Glu Glu Ala Arg Thr Asp Arg 50 55 60 Pro Ser Gln Gln Leu Arg Ser Leu Asn Gly Glu Trp Arg Phe Ala Trp 65 70 75 80 Phe Pro Ala Pro Glu Ala Val Pro Glu Ser Trp Leu Glu Cys Asp Leu 85 90 95 Pro Glu Ala Asp Thr Val Val Val Pro Ser Asn Trp Gln Met His Gly 100 105 110 Tyr Asp Ala Pro Ile Tyr Thr Asn Val Thr 
Tyr Pro Ile Thr Val Asn 115 120 125 Pro Pro Phe Val Pro Thr Glu Asn Pro Thr Gly Cys Tyr Ser Leu Thr 130 135 140 Phe Asn Val Asp Glu Ser Trp Leu Gln Glu Gly Gln Thr Arg Ile Ile 145 150 155 160 Phe Asp Gly Val Asn Ser Ala Phe His Leu Trp Cys Asn Gly Arg Trp 165 170 175 Val Gly Tyr Gly Gln Asp Ser Arg Leu Pro Ser Glu Phe Asp Leu Ser 180 185 190 Ala Phe Leu Arg Ala Gly Glu Asn Arg Leu Ala Val Met Val Leu Arg 195 200 205 Trp Ser Asp Gly Ser Tyr Leu Glu Asp Gln Asp Met Trp Arg Met Ser 210 215 220 Gly Ile Phe Arg Asp Val Ser Leu Leu His Lys Pro Thr Thr Gln Ile 225 230 235 240 Ser Asp Phe His Val Ala Thr Arg Phe Asn Asp Asp Phe Ser Arg Ala 245 250 255 Val Leu Glu Ala Glu Val Gln Met Cys Gly Glu Leu Arg Asp Tyr Leu 260 265 270 Arg Val Thr Val Ser Leu Trp Gln Gly Glu Thr Gln Val Ala Ser Gly 275 280 285 Thr Ala Pro Phe Gly Gly Glu Ile Ile Asp Glu Arg Gly Gly Tyr Ala 290 295 300 Asp Arg Val Thr Leu Arg Leu Asn Val Glu Asn Pro Lys Leu Trp Ser 305 310 315 320 Ala Glu Ile Pro Asn Leu Tyr Arg Ala Val Val Glu Leu His Thr Ala 325 330 335 Asp Gly Thr Leu Ile Glu Ala Glu Ala Cys Asp Val Gly Phe Arg Glu 340 345 350 Val Arg Ile Glu Asn Gly Leu Leu Leu Leu Asn Gly Lys Pro Leu Leu 355 360 365 Ile Arg Gly Val Asn Arg His Glu His His Pro Leu His Gly Gln Val 370 375 380 Met Asp Glu Gln Thr Met Val Gln Asp Ile Leu Leu Met Lys Gln Asn 385 390 395 400 Asn Phe Asn Ala Val Arg Cys Ser His Tyr Pro Asn His Pro Leu Trp 405 410 415 Tyr Thr Leu Cys Asp Arg Tyr Gly Leu Tyr Val Val Asp Glu Ala Asn 420 425 430 Ile Glu Thr His Gly Met Val Pro Met Asn Arg Leu Thr Asp Asp Pro 435 440 445 Arg Trp Leu Pro Ala Met Ser Glu Arg Val Thr Arg Met Val Gln Arg 450 455 460 Asp Arg Asn His Pro Ser Val Ile Ile Trp Ser Leu Gly Asn Glu Ser 465 470 475 480 Gly His Gly Ala Asn His Asp Ala Leu Tyr Arg Trp Ile Lys Ser Val 485 490 495 Asp Pro Ser Arg Pro Val Gln Tyr Glu Gly Gly Gly Ala Asp Thr Thr 500 505 510 Ala Thr Asp Ile Ile Cys Pro Met Tyr Ala Arg Val Asp Glu Asp Gln 515 520 525 Pro Phe Pro Ala Val Pro Lys Trp Ser Ile Lys 
Lys Trp Leu Ser Leu 530 535 540 Pro Gly Glu Thr Arg Pro Leu Ile Leu Cys Glu Tyr Ala His Ala Met 545 550 555 560 Gly Asn Ser Leu Gly Gly Phe Ala Lys Tyr Trp Gln Ala Phe Arg Gln 565 570 575 Tyr Pro Arg Leu Gln Gly Gly Phe Val Trp Asp Trp Val Asp Gln Ser 580 585 590 Leu Ile Lys Tyr Asp Glu Asn Gly Asn Pro Trp Ser Ala Tyr Gly Gly 595 600 605 Asp Phe Gly Asp Thr Pro Asn Asp Arg Gln Phe Cys Met Asn Gly Leu 610 615 620 Val Phe Ala Asp Arg Thr Pro His Pro Ala Leu Thr Glu Ala Lys His 625 630 635 640 Gln Gln Gln Phe Phe Gln Phe Arg Leu Ser Gly Gln Thr Ile Glu Val 645 650 655 Thr Ser Glu Tyr Leu Phe Arg His Ser Asp Asn Glu Leu Leu His Trp 660 665 670 Met Val Ala Leu Asp Gly Lys Pro Leu Ala Ser Gly Glu Val Pro Leu 675 680 685 Asp Val Ala Pro Gln Gly Lys Gln Leu Ile Glu Leu Pro Glu Leu Pro 690 695 700 Gln Pro Glu Ser Ala Gly Gln Leu Trp Leu Thr Val Arg Val Val Gln 705 710 715 720 Pro Asn Ala Thr Ala Trp Ser Glu Ala Gly His Ile Ser Ala Trp Gln 725 730 735 Gln Trp Arg Leu Ala Glu Asn Leu Ser Val Thr Leu Pro Ala Ala Ser 740 745 750 His Ala Ile Pro His Leu Thr Thr Ser Glu Met Asp Phe Cys Ile Glu 755 760 765 Leu Gly Asn Lys Arg Trp Gln Phe Asn Arg Gln Ser Gly Phe Leu Ser 770 775 780 Gln Met Trp Ile Gly Asp Lys Lys Gln Leu Leu Thr Pro Leu Arg Asp 785 790 795 800 Gln Phe Thr Arg Ala Pro Leu Asp Asn Asp Ile Gly Val Ser Glu Ala 805 810 815 Thr Arg Ile Asp Pro Asn Ala Trp Val Glu Arg Trp Lys Ala Ala Gly 820 825 830 His Tyr Gln Ala Glu Ala Ala Leu Leu Gln Cys Thr Ala Asp Thr Leu 835 840 845 Ala Asp Ala Val Leu Ile Thr Thr Ala His Ala Trp Gln His Gln Gly 850 855 860 Lys Thr Leu Phe Ile Ser Arg Lys Thr Tyr Arg Ile Asp Gly Ser Gly 865 870 875 880 Gln Met Ala Ile Thr Val Asp Val Glu Val Ala Ser Asp Thr Pro His 885 890 895 Pro Ala Arg Ile Gly Leu Asn Cys Gln Leu Ala Gln Val Ala Glu Arg 900 905 910 Val Asn Trp Leu Gly Leu Gly Pro Gln Glu Asn Tyr Pro Asp Arg Leu 915 920 925 Thr Ala Ala Cys Phe Asp Arg Trp Asp Leu Pro Leu Ser Asp Met Tyr 930 935 940 Thr Pro Tyr Val Phe Pro Ser Glu Asn Gly Leu Arg 
Cys Gly Thr Arg 945 950 955 960 Glu Leu Asn Tyr Gly Pro His Gln Trp Arg Gly Asp Phe Gln Phe Asn 965 970 975 Ile Ser Arg Tyr Ser Gln Gln Gln Leu Met Glu Thr Ser His Arg His 980 985 990 Leu Leu His Ala Glu Glu Gly Thr Trp Leu Asn Ile Asp Gly Phe His 995 1000 1005 Met Gly Ile Gly Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu 1010 1015 1020 Phe Gln Leu Ser Ala Gly Arg Tyr His Tyr Gln Leu Val Trp Cys Gln 1025 1030 1035 1040 Lys 16 8364 DNA Artificial Sequence CDS (1270)..(4395) Description of Artificial Sequence pTetO7Sag4-LacZ 16 ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga 60 ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 120 ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta gggttccgat 180 ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt tcacgtagtg 240 ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaata 300 gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat tcttttgatt 360 tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat 420 ttaacgcgaa ttttaacaaa atattaacgc ttacaatttc cattcgccat tcaggctgcg 480 caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540 gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg 600 taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat tgggtaccga 660 gctcgacttt cacttttctc tatcactgat agggagtggt aaactcgact ttcacttttc 720 tctatcactg atagggagtg gtaaactcga ctttcacttt tctctatcac tgatagggag 780 tggtaaactc gactttcact tttctctatc actgataggg agtggtaaac tcgactttca 840 cttttctcta tcactgatag ggagtggtaa actcgacttt cacttttctc tatcactgat 900 agggagtggt aaactcgact ttcacttttc tctatcactg atagggagtg gtaaactcga 960 ggtcgacggt atcgataagc ttacgccgct gagactaact agaaagaagt gtgcaacagt 1020 tcatgaggga caaaaggaat gtgatgcggt tcgcttgaag aaggaatgtt taagcacgtc 1080 aacaatacgc cttggcgaat gttcatgact gttcatgtgg ttcatcggat catttgaaaa 1140 catcgtgagg ctggtacctg gtcgcaaacg tcgtagtgta gtaccgacaa taacgtcgtc 1200 gttcaagggg acgcagttct cggaagacgc gtcgcagcat actgcaactg 
ctttcgtctg 1260 tcttcaacc atg cat gcc atg gag aag tta tta ttc cga agt tcc tat tct 1311 Met His Ala Met Glu Lys Leu Leu Phe Arg Ser Ser Tyr Ser 1 5 10 cta gaa agt ata gga act tca agc ttg gca ctg gcc gtc gtt tta caa 1359 Leu Glu Ser Ile Gly Thr Ser Ser Leu Ala Leu Ala Val Val Leu Gln 15 20 25 30 cgt cgt gac tgg gaa aac cct ggc gtt acc caa ctt aat cgc ctt gca 1407 Arg Arg Asp Trp Glu Asn Pro Gly Val Thr Gln Leu Asn Arg Leu Ala 35 40 45 gca cat ccc cct ttc gcc agc tgg cgt aat agc gaa gag gcc cgc acc 1455 Ala His Pro Pro Phe Ala Ser Trp Arg Asn Ser Glu Glu Ala Arg Thr 50 55 60 gat cgc cct tcc caa cag ttg cgc agc ctg aat ggc gaa tgg cgc ttt 1503 Asp Arg Pro Ser Gln Gln Leu Arg Ser Leu Asn Gly Glu Trp Arg Phe 65 70 75 gcc tgg ttt ccg gca cca gaa gcg gtg ccg gaa agc tgg ctg gag tgc 1551 Ala Trp Phe Pro Ala Pro Glu Ala Val Pro Glu Ser Trp Leu Glu Cys 80 85 90 gat ctt cct gag gcc gat act gtc gtc gtc ccc tca aac tgg cag atg 1599 Asp Leu Pro Glu Ala Asp Thr Val Val Val Pro Ser Asn Trp Gln Met 95 100 105 110 cac ggt tac gat gcg ccc atc tac acc aac gta acc tat ccc att acg 1647 His Gly Tyr Asp Ala Pro Ile Tyr Thr Asn Val Thr Tyr Pro Ile Thr 115 120 125 gtc aat ccg ccg ttt gtt ccc acg gag aat ccg acg ggt tgt tac tcg 1695 Val Asn Pro Pro Phe Val Pro Thr Glu Asn Pro Thr Gly Cys Tyr Ser 130 135 140 ctc aca ttt aat gtt gat gaa agc tgg cta cag gaa ggc cag acg cga 1743 Leu Thr Phe Asn Val Asp Glu Ser Trp Leu Gln Glu Gly Gln Thr Arg 145 150 155 att att ttt gat ggc gtt aac tcg gcg ttt cat ctg tgg tgc aac ggg 1791 Ile Ile Phe Asp Gly Val Asn Ser Ala Phe His Leu Trp Cys Asn Gly 160 165 170 cgc tgg gtc ggt tac ggc cag gac agt cgt ttg ccg tct gaa ttt gac 1839 Arg Trp Val Gly Tyr Gly Gln Asp Ser Arg Leu Pro Ser Glu Phe Asp 175 180 185 190 ctg agc gca ttt tta cgc gcc gga gaa aac cgc ctc gcg gtg atg gtg 1887 Leu Ser Ala Phe Leu Arg Ala Gly Glu Asn Arg Leu Ala Val Met Val 195 200 205 ctg cgt tgg agt gac ggc agt tat ctg gaa gat cag gat atg tgg cgg 1935 Leu Arg Trp Ser Asp Gly Ser Tyr Leu Glu Asp 
Gln Asp Met Trp Arg 210 215 220 atg agc ggc att ttc cgt gac gtc tcg ttg ctg cat aaa ccg act aca 1983 Met Ser Gly Ile Phe Arg Asp Val Ser Leu Leu His Lys Pro Thr Thr 225 230 235 caa atc agc gat ttc cat gtt gcc act cgc ttt aat gat gat ttc agc 2031 Gln Ile Ser Asp Phe His Val Ala Thr Arg Phe Asn Asp Asp Phe Ser 240 245 250 cgc gct gta ctg gag gct gaa gtt cag atg tgc ggc gag ttg cgt gac 2079 Arg Ala Val Leu Glu Ala Glu Val Gln Met Cys Gly Glu Leu Arg Asp 255 260 265 270 tac cta cgg gta aca gtt tct tta tgg cag ggt gaa acg cag gtc gcc 2127 Tyr Leu Arg Val Thr Val Ser Leu Trp Gln Gly Glu Thr Gln Val Ala 275 280 285 agc ggc acc gcg cct ttc ggc ggt gaa att atc gat gag cgt ggt ggt 2175 Ser Gly Thr Ala Pro Phe Gly Gly Glu Ile Ile Asp Glu Arg Gly Gly 290 295 300 tat gcc gat cgc gtc aca cta cgt ctg aac gtc gaa aac ccg aaa ctg 2223 Tyr Ala Asp Arg Val Thr Leu Arg Leu Asn Val Glu Asn Pro Lys Leu 305 310 315 tgg agc gcc gaa atc ccg aat ctc tat cgt gcg gtg gtt gaa ctg cac 2271 Trp Ser Ala Glu Ile Pro Asn Leu Tyr Arg Ala Val Val Glu Leu His 320 325 330 acc gcc gac ggc acg ctg att gaa gca gaa gcc tgc gat gtc ggt ttc 2319 Thr Ala Asp Gly Thr Leu Ile Glu Ala Glu Ala Cys Asp Val Gly Phe 335 340 345 350 cgc gag gtg cgg att gaa aat ggt ctg ctg ctg ctg aac ggc aag ccg 2367 Arg Glu Val Arg Ile Glu Asn Gly Leu Leu Leu Leu Asn Gly Lys Pro 355 360 365 ttg ctg att cga ggc gtt aac cgt cac gag cat cat cct ctg cat ggt 2415 Leu Leu Ile Arg Gly Val Asn Arg His Glu His His Pro Leu His Gly 370 375 380 cag gtc atg gat gag cag acg atg gtg cag gat atc ctg ctg atg aag 2463 Gln Val Met Asp Glu Gln Thr Met Val Gln Asp Ile Leu Leu Met Lys 385 390 395 cag aac aac ttt aac gcc gtg cgc tgt tcg cat tat ccg aac cat ccg 2511 Gln Asn Asn Phe Asn Ala Val Arg Cys Ser His Tyr Pro Asn His Pro 400 405 410 ctg tgg tac acg ctg tgc gac cgc tac ggc ctg tat gtg gtg gat gaa 2559 Leu Trp Tyr Thr Leu Cys Asp Arg Tyr Gly Leu Tyr Val Val Asp Glu 415 420 425 430 gcc aat att gaa acc cac ggc atg gtg cca atg aat cgt ctg acc gat 2607 
Ala Asn Ile Glu Thr His Gly Met Val Pro Met Asn Arg Leu Thr Asp 435 440 445 gat ccg cgc tgg cta ccg gcg atg agc gaa cgc gta acg cga atg gtg 2655 Asp Pro Arg Trp Leu Pro Ala Met Ser Glu Arg Val Thr Arg Met Val 450 455 460 cag cgc gat cgt aat cac ccg agt gtg atc atc tgg tcg ctg ggg aat 2703 Gln Arg Asp Arg Asn His Pro Ser Val Ile Ile Trp Ser Leu Gly Asn 465 470 475 gaa tca ggc cac ggc gct aat cac gac gcg ctg tat cgc tgg atc aaa 2751 Glu Ser Gly His Gly Ala Asn His Asp Ala Leu Tyr Arg Trp Ile Lys 480 485 490 tct gtc gat cct tcc cgc ccg gtg cag tat gaa ggc ggc gga gcc gac 2799 Ser Val Asp Pro Ser Arg Pro Val Gln Tyr Glu Gly Gly Gly Ala Asp 495 500 505 510 acc acg gcc acc gat att att tgc ccg atg tac gcg cgc gtg gat gaa 2847 Thr Thr Ala Thr Asp Ile Ile Cys Pro Met Tyr Ala Arg Val Asp Glu 515 520 525 gac cag ccc ttc ccg gct gtg ccg aaa tgg tcc atc aaa aaa tgg ctt 2895 Asp Gln Pro Phe Pro Ala Val Pro Lys Trp Ser Ile Lys Lys Trp Leu 530 535 540 tcg cta cct gga gag acg cgc ccg ctg atc ctt tgc gaa tac gcc cac 2943 Ser Leu Pro Gly Glu Thr Arg Pro Leu Ile Leu Cys Glu Tyr Ala His 545 550 555 gcg atg ggt aac agt ctt ggc ggt ttc gct aaa tac tgg cag gcg ttt 2991 Ala Met Gly Asn Ser Leu Gly Gly Phe Ala Lys Tyr Trp Gln Ala Phe 560 565 570 cgt cag tat ccc cgt tta cag ggc ggc ttc gtc tgg gac tgg gtg gat 3039 Arg Gln Tyr Pro Arg Leu Gln Gly Gly Phe Val Trp Asp Trp Val Asp 575 580 585 590 cag tcg ctg att aaa tat gat gaa aac ggc aac ccg tgg tcg gct tac 3087 Gln Ser Leu Ile Lys Tyr Asp Glu Asn Gly Asn Pro Trp Ser Ala Tyr 595 600 605 ggc ggt gat ttt ggc gat acg ccg aac gat cgc cag ttc tgt atg aac 3135 Gly Gly Asp Phe Gly Asp Thr Pro Asn Asp Arg Gln Phe Cys Met Asn 610 615 620 ggt ctg gtc ttt gcc gac cgc acg ccg cat cca gcg ctg acg gaa gca 3183 Gly Leu Val Phe Ala Asp Arg Thr Pro His Pro Ala Leu Thr Glu Ala 625 630 635 aaa cac cag cag cag ttt ttc cag ttc cgt tta tcc ggg caa acc atc 3231 Lys His Gln Gln Gln Phe Phe Gln Phe Arg Leu Ser Gly Gln Thr Ile 640 645 650 gaa gtg acc agc gaa tac ctg 
ttc cgt cat agc gat aac gag ctc ctg 3279 Glu Val Thr Ser Glu Tyr Leu Phe Arg His Ser Asp Asn Glu Leu Leu 655 660 665 670 cac tgg atg gtg gcg ctg gat ggt aag ccg ctg gca agc ggt gaa gtg 3327 His Trp Met Val Ala Leu Asp Gly Lys Pro Leu Ala Ser Gly Glu Val 675 680 685 cct ctg gat gtc gct cca caa ggt aaa cag ttg att gaa ctg cct gaa 3375 Pro Leu Asp Val Ala Pro Gln Gly Lys Gln Leu Ile Glu Leu Pro Glu 690 695 700 cta ccg cag ccg gag agc gcc ggg caa ctc tgg ctc aca gta cgc gta 3423 Leu Pro Gln Pro Glu Ser Ala Gly Gln Leu Trp Leu Thr Val Arg Val 705 710 715 gtg caa ccg aac gcg acc gca tgg tca gaa gcc ggg cac atc agc gcc 3471 Val Gln Pro Asn Ala Thr Ala Trp Ser Glu Ala Gly His Ile Ser Ala 720 725 730 tgg cag cag tgg cgt ctg gcg gaa aac ctc agt gtg acg ctc ccc gcc 3519 Trp Gln Gln Trp Arg Leu Ala Glu Asn Leu Ser Val Thr Leu Pro Ala 735 740 745 750 gcg tcc cac gcc atc ccg cat ctg acc acc agc gaa atg gat ttt tgc 3567 Ala Ser His Ala Ile Pro His Leu Thr Thr Ser Glu Met Asp Phe Cys 755 760 765 atc gag ctg ggt aat aag cgt tgg caa ttt aac cgc cag tca ggc ttt 3615 Ile Glu Leu Gly Asn Lys Arg Trp Gln Phe Asn Arg Gln Ser Gly Phe 770 775 780 ctt tca cag atg tgg att ggc gat aaa aaa caa ctg ctg acg ccg ctg 3663 Leu Ser Gln Met Trp Ile Gly Asp Lys Lys Gln Leu Leu Thr Pro Leu 785 790 795 cgc gat cag ttc acc cgt gca ccg ctg gat aac gac att ggc gta agt 3711 Arg Asp Gln Phe Thr Arg Ala Pro Leu Asp Asn Asp Ile Gly Val Ser 800 805 810 gaa gcg acc cgc att gac cct aac gcc tgg gtc gaa cgc tgg aag gcg 3759 Glu Ala Thr Arg Ile Asp Pro Asn Ala Trp Val Glu Arg Trp Lys Ala 815 820 825 830 gcg ggc cat tac cag gcc gaa gca gcg ttg ttg cag tgc acg gca gat 3807 Ala Gly His Tyr Gln Ala Glu Ala Ala Leu Leu Gln Cys Thr Ala Asp 835 840 845 aca ctt gct gat gcg gtg ctg att acg acc gct cac gcg tgg cag cat 3855 Thr Leu Ala Asp Ala Val Leu Ile Thr Thr Ala His Ala Trp Gln His 850 855 860 cag ggg aaa acc tta ttt atc agc cgg aaa acc tac cgg att gat ggt 3903 Gln Gly Lys Thr Leu Phe Ile Ser Arg Lys Thr Tyr Arg Ile Asp 
Gly 865 870 875 agt ggt caa atg gcg att acc gtt gat gtt gaa gtg gcg agc gat aca 3951 Ser Gly Gln Met Ala Ile Thr Val Asp Val Glu Val Ala Ser Asp Thr 880 885 890 ccg cat ccg gcg cgg att ggc ctg aac tgc cag ctg gcg cag gta gca 3999 Pro His Pro Ala Arg Ile Gly Leu Asn Cys Gln Leu Ala Gln Val Ala 895 900 905 910 gag cgg gta aac tgg ctc gga tta ggg ccg caa gaa aac tat ccc gac 4047 Glu Arg Val Asn Trp Leu Gly Leu Gly Pro Gln Glu Asn Tyr Pro Asp 915 920 925 cgc ctt act gcc gcc tgt ttt gac cgc tgg gat ctg cca ttg tca gac 4095 Arg Leu Thr Ala Ala Cys Phe Asp Arg Trp Asp Leu Pro Leu Ser Asp 930 935 940 atg tat acc ccg tac gtc ttc ccg agc gaa aac ggt ctg cgc tgc ggg 4143 Met Tyr Thr Pro Tyr Val Phe Pro Ser Glu Asn Gly Leu Arg Cys Gly 945 950 955 acg cgc gaa ttg aat tat ggc cca cac cag tgg cgc ggc gac ttc cag 4191 Thr Arg Glu Leu Asn Tyr Gly Pro His Gln Trp Arg Gly Asp Phe Gln 960 965 970 ttc aac atc agc cgc tac agt caa cag caa ctg atg gaa acc agc cat 4239 Phe Asn Ile Ser Arg Tyr Ser Gln Gln Gln Leu Met Glu Thr Ser His 975 980 985 990 cgc cat ctg ctg cac gcg gaa gaa ggc aca tgg ctg aat atc gac ggt 4287 Arg His Leu Leu His Ala Glu Glu Gly Thr Trp Leu Asn Ile Asp Gly 995 1000 1005 ttc cat atg ggg att ggt ggc gac gac tcc tgg agc ccg tca gta tcg 4335 Phe His Met Gly Ile Gly Gly Asp Asp Ser Trp Ser Pro Ser Val Ser 1010 1015 1020 gcg gaa ttc cag ctg agc gcc ggt cgc tac cat tac cag ttg gtc tgg 4383 Ala Glu Phe Gln Leu Ser Ala Gly Arg Tyr His Tyr Gln Leu Val Trp 1025 1030 1035 tgt caa aaa taa taattaatta atcaccgttg tgctcacttc tcaaatcgac 4435 Cys Gln Lys 1040 aaaggaaaca cacttcgtgc agcatgtgcc ccattataaa gaaactgagt tgttccgctg 4495 tggcttgcag gtgtcacatc cacaaaaacc ggccgactct aaataggagt gtttcgcagc 4555 aagcagcgaa agtttatgac tgggtccgaa tctctgaacg gatgtgtggc ggacctggct 4615 gatgttgatc gccgtcgacg gtatcgataa gcttgatatg catgtccgcg ttcgtgaaat 4675 tctctgatca agcggagtga tcaccaatca tcgtctcagc gggatgacgt tgcggcaagg 4735 gccggctcgc ggtgggcagt cagatgccga acgtaactca ggacggcttg cgctcatcgc 4795 
agaacagggg tggtgcctgc attgggtgcg gttggtgatc ctggttggac cggtggagat 4855 gcgcgcgcac gaaggggatg tgtcagaaac attttgtttg ttctctgtga acttttagat 4915 gtgttaaagg cggcgaatat tagcagagag tcctccttgt tccattctct cttgaatttc 4975 gccctttcct tctctttgcg agtgtggtag agaacaagca ctcgttcgcc gtccctgacg 5035 acgcaacccg cgcagaagac atccaccaaa cggtgttaca caatcacctt gtgtgaagtt 5095 cttgcggaaa actactcgtt ggcatttttt cttgaattcc ctttttcgac aaaatgcatg 5155 agaaaaaaat cactggatat accaccgttg atatatccca atcgcatcgt aaagaacatt 5215 ttgaggcatt tcagtcagtt gctcaatgta cctataacca gaccgttcag ctggatatta 5275 cggccttttt aaagaccgta aagaaaaata agcacaagtt ttatccggcc tttattcaca 5335 ttcttgcccg cctgatgaat gctcatccgg aattccgtat ggcaatgaaa gacggtgagc 5395 tggtgatatg ggatagtgtt cacccttgtt acaccgtttt ccatgagcaa actgaaacgt 5455 tttcatcgct ctggagtgaa taccacgacg atttccggca gtttctacac atatattcgc 5515 aagatgtggc gtgttacggt gaaaacctgg cctatttccc taaagggttt attgagaata 5575 tgtttttcgt ctcagccaat ccctgggtga gtttcaccag ttttgattta aacgtggcca 5635 atatggacaa cttcttcgcc cccgttttca ccatgggcaa atattatacg caaggcgaca 5695 aggtgctgat gccgctggcg attcaggttc atcatgccgt ctgtgatggc ttccatgtcg 5755 gcagaatgct taatgaatta caacagtact gcgatgagtg gcagggcggg gcttaattaa 5815 tcaccgttgt gctcacttct caaatcgaca aaggaaacac acttcgtgca gcatgtgccc 5875 cattataaag aaactgagtt gttccgctgt ggcttgcagg tgtcacatcc acaaaaaccg 5935 gccgactcta aataggagtg tttcgcagca agcagcgaaa gtttatgact gggtccgaat 5995 ctctgaacgg atgtgtggcg gacctggctg atgttgatcg ccgtcgacac acgcgccaca 6055 tgggtcaata cacaagacag ctatcagttg ttttagtcga accggttaac acaattcttg 6115 cccccccgag ggggatccac tagagcggcc gccaccgcgg tggagctcca gcttttgttc 6175 cctttagtga gggttaattg cgcgcttggc gtaatcatgg tcatagctgt ttcctgtgtg 6235 aaattgttat ccgctcacaa ttccacacaa catacgagcc ggaagcataa agtgtaaagc 6295 ctggggtgcc taatgagtga gctaactcac attaattgcg ttgcgctcac tgcccgcttt 6355 ccagtcggga aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg cggggagagg 6415 cggtttgcgt attgggcgct cttccgcttc ctcgctcact gactcgctgc gctcggtcgt 6475 tcggctgcgg 
cgagcggtat cagctcactc aaaggcggta atacggttat ccacagaatc 6535 aggggataac gcaggaaaga acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa 6595 aaaggccgcg ttgctggcgt ttttccatag gctccgcccc cctgacgagc atcacaaaaa 6655 tcgacgctca agtcagaggt ggcgaaaccc gacaggacta taaagatacc aggcgtttcc 6715 ccctggaagc tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg gatacctgtc 6775 cgcctttctc ccttcgggaa gcgtggcgct ttctcatagc tcacgctgta ggtatctcag 6835 ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga 6895 ccgctgcgcc ttatccggta actatcgtct tgagtccaac ccggtaagac acgacttatc 6955 gccactggca gcagccactg gtaacaggat tagcagagcg aggtatgtag gcggtgctac 7015 agagttcttg aagtggtggc ctaactacgg ctacactaga aggacagtat ttggtatctg 7075 cgctctgctg aagccagtta ccttcggaaa aagagttggt agctcttgat ccggcaaaca 7135 aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa 7195 aggatctcaa gaagatcctt tgatcttttc tacggggtct gacgctcagt ggaacgaaaa 7255 ctcacgttaa gggattttgg tcatgagatt atcaaaaagg atcttcacct agatcctttt 7315 aaattaaaaa tgaagtttta aatcaatcta aagtatatat gagtaaactt ggtctgacag 7375 ttaccaatgc ttaatcagtg aggcacctat ctcagcgatc tgtctatttc gttcatccat 7435 agttgcctga ctccccgtcg tgtagataac tacgatacgg gagggcttac catctggccc 7495 cagtgctgca atgataccgc gagacccacg ctcaccggct ccagatttat cagcaataaa 7555 ccagccagcc ggaagggccg agcgcagaag tggtcctgca actttatccg cctccatcca 7615 gtctattaat tgttgccggg aagctagagt aagtagttcg ccagttaata gtttgcgcaa 7675 cgttgttgcc attgctacag gcatcgtggt gtcacgctcg tcgtttggta tggcttcatt 7735 cagctccggt tcccaacgat caaggcgagt tacatgatcc cccatgttgt gcaaaaaagc 7795 ggttagctcc ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag tgttatcact 7855 catggttatg gcagcactgc ataattctct tactgtcatg ccatccgtaa gatgcttttc 7915 tgtgactggt gagtactcaa ccaagtcatt ctgagaatag tgtatgcggc gaccgagttg 7975 ctcttgcccg gcgtcaatac gggataatac cgcgccacat agcagaactt taaaagtgct 8035 catcattgga aaacgttctt cggggcgaaa actctcaagg atcttaccgc tgttgagatc 8095 cagttcgatg taacccactc gtgcacccaa ctgatcttca gcatctttta ctttcaccag 8155 cgtttctggg tgagcaaaaa 
caggaaggca aaatgccgca aaaaagggaa taagggcgac 8215 acggaaatgt tgaatactca tactcttcct ttttcaatat tattgaagca tttatcaggg 8275 ttattgtctc atgagcggat acatatttga atgtatttag aaaaataaac aaataggggt 8335 tccgcgcaca tttccccgaa aagtgccac 8364 17 1041 PRT Artificial Sequence Description of Artificial Sequence pTetO7Sag4-LacZ 17 Met His Ala Met Glu Lys Leu Leu Phe Arg Ser Ser Tyr Ser Leu Glu 1 5 10 15 Ser Ile Gly Thr Ser Ser Leu Ala Leu Ala Val Val Leu Gln Arg Arg 20 25 30 Asp Trp Glu Asn Pro Gly Val Thr Gln Leu Asn Arg Leu Ala Ala His 35 40 45 Pro Pro Phe Ala Ser Trp Arg Asn Ser Glu Glu Ala Arg Thr Asp Arg 50 55 60 Pro Ser Gln Gln Leu Arg Ser Leu Asn Gly Glu Trp Arg Phe Ala Trp 65 70 75 80 Phe Pro Ala Pro Glu Ala Val Pro Glu Ser Trp Leu Glu Cys Asp Leu 85 90 95 Pro Glu Ala Asp Thr Val Val Val Pro Ser Asn Trp Gln Met His Gly 100 105 110 Tyr Asp Ala Pro Ile Tyr Thr Asn Val Thr Tyr Pro Ile Thr Val Asn 115 120 125 Pro Pro Phe Val Pro Thr Glu Asn Pro Thr Gly Cys Tyr Ser Leu Thr 130 135 140 Phe Asn Val Asp Glu Ser Trp Leu Gln Glu Gly Gln Thr Arg Ile Ile 145 150 155 160 Phe Asp Gly Val Asn Ser Ala Phe His Leu Trp Cys Asn Gly Arg Trp 165 170 175 Val Gly Tyr Gly Gln Asp Ser Arg Leu Pro Ser Glu Phe Asp Leu Ser 180 185 190 Ala Phe Leu Arg Ala Gly Glu Asn Arg Leu Ala Val Met Val Leu Arg 195 200 205 Trp Ser Asp Gly Ser Tyr Leu Glu Asp Gln Asp Met Trp Arg Met Ser 210 215 220 Gly Ile Phe Arg Asp Val Ser Leu Leu His Lys Pro Thr Thr Gln Ile 225 230 235 240 Ser Asp Phe His Val Ala Thr Arg Phe Asn Asp Asp Phe Ser Arg Ala 245 250 255 Val Leu Glu Ala Glu Val Gln Met Cys Gly Glu Leu Arg Asp Tyr Leu 260 265 270 Arg Val Thr Val Ser Leu Trp Gln Gly Glu Thr Gln Val Ala Ser Gly 275 280 285 Thr Ala Pro Phe Gly Gly Glu Ile Ile Asp Glu Arg Gly Gly Tyr Ala 290 295 300 Asp Arg Val Thr Leu Arg Leu Asn Val Glu Asn Pro Lys Leu Trp Ser 305 310 315 320 Ala Glu Ile Pro Asn Leu Tyr Arg Ala Val Val Glu Leu His Thr Ala 325 330 335 Asp Gly Thr Leu Ile Glu Ala Glu Ala Cys Asp Val Gly Phe Arg Glu 340 345 350 Val Arg Ile Glu 
Asn Gly Leu Leu Leu Leu Asn Gly Lys Pro Leu Leu 355 360 365 Ile Arg Gly Val Asn Arg His Glu His His Pro Leu His Gly Gln Val 370 375 380 Met Asp Glu Gln Thr Met Val Gln Asp Ile Leu Leu Met Lys Gln Asn 385 390 395 400 Asn Phe Asn Ala Val Arg Cys Ser His Tyr Pro Asn His Pro Leu Trp 405 410 415 Tyr Thr Leu Cys Asp Arg Tyr Gly Leu Tyr Val Val Asp Glu Ala Asn 420 425 430 Ile Glu Thr His Gly Met Val Pro Met Asn Arg Leu Thr Asp Asp Pro 435 440 445 Arg Trp Leu Pro Ala Met Ser Glu Arg Val Thr Arg Met Val Gln Arg 450 455 460 Asp Arg Asn His Pro Ser Val Ile Ile Trp Ser Leu Gly Asn Glu Ser 465 470 475 480 Gly His Gly Ala Asn His Asp Ala Leu Tyr Arg Trp Ile Lys Ser Val 485 490 495 Asp Pro Ser Arg Pro Val Gln Tyr Glu Gly Gly Gly Ala Asp Thr Thr 500 505 510 Ala Thr Asp Ile Ile Cys Pro Met Tyr Ala Arg Val Asp Glu Asp Gln 515 520 525 Pro Phe Pro Ala Val Pro Lys Trp Ser Ile Lys Lys Trp Leu Ser Leu 530 535 540 Pro Gly Glu Thr Arg Pro Leu Ile Leu Cys Glu Tyr Ala His Ala Met 545 550 555 560 Gly Asn Ser Leu Gly Gly Phe Ala Lys Tyr Trp Gln Ala Phe Arg Gln 565 570 575 Tyr Pro Arg Leu Gln Gly Gly Phe Val Trp Asp Trp Val Asp Gln Ser 580 585 590 Leu Ile Lys Tyr Asp Glu Asn Gly Asn Pro Trp Ser Ala Tyr Gly Gly 595 600 605 Asp Phe Gly Asp Thr Pro Asn Asp Arg Gln Phe Cys Met Asn Gly Leu 610 615 620 Val Phe Ala Asp Arg Thr Pro His Pro Ala Leu Thr Glu Ala Lys His 625 630 635 640 Gln Gln Gln Phe Phe Gln Phe Arg Leu Ser Gly Gln Thr Ile Glu Val 645 650 655 Thr Ser Glu Tyr Leu Phe Arg His Ser Asp Asn Glu Leu Leu His Trp 660 665 670 Met Val Ala Leu Asp Gly Lys Pro Leu Ala Ser Gly Glu Val Pro Leu 675 680 685 Asp Val Ala Pro Gln Gly Lys Gln Leu Ile Glu Leu Pro Glu Leu Pro 690 695 700 Gln Pro Glu Ser Ala Gly Gln Leu Trp Leu Thr Val Arg Val Val Gln 705 710 715 720 Pro Asn Ala Thr Ala Trp Ser Glu Ala Gly His Ile Ser Ala Trp Gln 725 730 735 Gln Trp Arg Leu Ala Glu Asn Leu Ser Val Thr Leu Pro Ala Ala Ser 740 745 750 His Ala Ile Pro His Leu Thr Thr Ser Glu Met Asp Phe Cys Ile Glu 755 760 765 Leu Gly Asn Lys Arg 
Trp Gln Phe Asn Arg Gln Ser Gly Phe Leu Ser 770 775 780 Gln Met Trp Ile Gly Asp Lys Lys Gln Leu Leu Thr Pro Leu Arg Asp 785 790 795 800 Gln Phe Thr Arg Ala Pro Leu Asp Asn Asp Ile Gly Val Ser Glu Ala 805 810 815 Thr Arg Ile Asp Pro Asn Ala Trp Val Glu Arg Trp Lys Ala Ala Gly 820 825 830 His Tyr Gln Ala Glu Ala Ala Leu Leu Gln Cys Thr Ala Asp Thr Leu 835 840 845 Ala Asp Ala Val Leu Ile Thr Thr Ala His Ala Trp Gln His Gln Gly 850 855 860 Lys Thr Leu Phe Ile Ser Arg Lys Thr Tyr Arg Ile Asp Gly Ser Gly 865 870 875 880 Gln Met Ala Ile Thr Val Asp Val Glu Val Ala Ser Asp Thr Pro His 885 890 895 Pro Ala Arg Ile Gly Leu Asn Cys Gln Leu Ala Gln Val Ala Glu Arg 900 905 910 Val Asn Trp Leu Gly Leu Gly Pro Gln Glu Asn Tyr Pro Asp Arg Leu 915 920 925 Thr Ala Ala Cys Phe Asp Arg Trp Asp Leu Pro Leu Ser Asp Met Tyr 930 935 940 Thr Pro Tyr Val Phe Pro Ser Glu Asn Gly Leu Arg Cys Gly Thr Arg 945 950 955 960 Glu Leu Asn Tyr Gly Pro His Gln Trp Arg Gly Asp Phe Gln Phe Asn 965 970 975 Ile Ser Arg Tyr Ser Gln Gln Gln Leu Met Glu Thr Ser His Arg His 980 985 990 Leu Leu His Ala Glu Glu Gly Thr Trp Leu Asn Ile Asp Gly Phe His 995 1000 1005 Met Gly Ile Gly Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu 1010 1015 1020 Phe Gln Leu Ser Ala Gly Arg Tyr His Tyr Gln Leu Val Trp Cys Gln 1025 1030 1035 1040 Lys 18 8398 DNA Artificial Sequence CDS (1167)..(3209) Description of Artificial Sequence Ptub8TetR-GCN5-DHFRTS 18 ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga 60 ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 120 ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta gggttccgat 180 ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt tcacgtagtg 240 ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaata 300 gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat tcttttgatt 360 tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat 420 ttaacgcgaa ttttaacaaa atattaacgc ttacaatttc cattcgccat tcaggctgcg 480 caactgttgg gaagggcgat 
cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540 gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg 600 taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat tgggtaccgg 660 gccccccctc gacggtatcg ataagcttaa ccacaaacct tgagacgcgt gttccaacca 720 cgcaccctga cacgcgtgtt ccaaccacgc accctgagac gcgtgttcta accacgcacc 780 ctgagacgcg tgttctaacc acgcaccctg agacgcgtgt tcaagcttgc ctgcattggg 840 tgcggttggt gatcctggtt ggaccggtgg agatgcgcgc gcacgaaggg gatgtgtcag 900 aaacattttg tttgttctct gtgaactttt agatgtgtta aaggcggcga atattagcag 960 agagtcctcc ttgttccatt ctctcttgaa tttcgccctt tccttctctt tgcgagtgtg 1020 gtagagaaca agcactcgtt cgccgtccct gacgacgcaa cccgcgcaga agacatccac 1080 caaacggtgt tacacaatca ccttgtgtga agttcttgcg gaaaactact cgttggcatt 1140 ttttcttgaa ttcctttttc gacaaa atg tcg cgc ctg gac aag agc aaa gtc 1193 Met Ser Arg Leu Asp Lys Ser Lys Val 1 5 atc aac tct gct ctg gaa tta ctc aat gaa gtc ggt atc gaa ggc ctg 1241 Ile Asn Ser Ala Leu Glu Leu Leu Asn Glu Val Gly Ile Glu Gly Leu 10 15 20 25 acg aca agg aaa ctc gct caa aag ctg gga gtt gag cag cct acc ctg 1289 Thr Thr Arg Lys Leu Ala Gln Lys Leu Gly Val Glu Gln Pro Thr Leu 30 35 40 tac tgg cac gtg aag aac aag cgg gcc ctg ctc gat gcc ctg gca atc 1337 Tyr Trp His Val Lys Asn Lys Arg Ala Leu Leu Asp Ala Leu Ala Ile 45 50 55 gag atg ctg gac agg cat cat acc cac ttc tgc ccc ctg gaa ggc gag 1385 Glu Met Leu Asp Arg His His Thr His Phe Cys Pro Leu Glu Gly Glu 60 65 70 tca tgg caa gac ttt ctg cgg aac aac gcc aag tca ttc cgc tgt gct 1433 Ser Trp Gln Asp Phe Leu Arg Asn Asn Ala Lys Ser Phe Arg Cys Ala 75 80 85 ctc ctc tca cat cgc gac ggg gct aaa gtg cat ctc ggc acc cgc cca 1481 Leu Leu Ser His Arg Asp Gly Ala Lys Val His Leu Gly Thr Arg Pro 90 95 100 105 aca gag aaa cag tac gaa acc ctg gaa aat cag ctc gcg ttc ctg tgt 1529 Thr Glu Lys Gln Tyr Glu Thr Leu Glu Asn Gln Leu Ala Phe Leu Cys 110 115 120 cag caa ggc ttc tcc ctg gag aac gca ctg tac gct ctg tcc gcc gtg 1577 Gln Gln Gly Phe Ser Leu Glu Asn Ala Leu Tyr Ala Leu Ser Ala Val 
125 130 135 ggc cac ttt aca ctg ggc tgc gta ttg gag gat cag gag cat caa gta 1625 Gly His Phe Thr Leu Gly Cys Val Leu Glu Asp Gln Glu His Gln Val 140 145 150 gca aaa gag gaa aga gag aca cct acc acc gat tct atg ccc cca ctt 1673 Ala Lys Glu Glu Arg Glu Thr Pro Thr Thr Asp Ser Met Pro Pro Leu 155 160 165 ctg aga caa gca att gag ctg ttc gac cat cag gga gcc gaa cct gcc 1721 Leu Arg Gln Ala Ile Glu Leu Phe Asp His Gln Gly Ala Glu Pro Ala 170 175 180 185 ttc ctt ttc ggc ctg gaa cta atc ata tgt ggc ctg gag aaa cag ctg 1769 Phe Leu Phe Gly Leu Glu Leu Ile Ile Cys Gly Leu Glu Lys Gln Leu 190 195 200 aag tgc gaa agc ggt atg cat aaa ggc gct cca aca ggt ctg ggg ctc 1817 Lys Cys Glu Ser Gly Met His Lys Gly Ala Pro Thr Gly Leu Gly Leu 205 210 215 gcg tct ttc ttc ggg aag tcg ttc att ttt cac aca ttg cat gcg gcg 1865 Ala Ser Phe Phe Gly Lys Ser Phe Ile Phe His Thr Leu His Ala Ala 220 225 230 ttg ccc gcg ctg ttg gag gaa ctc gcc aac acc gtc gtg ggg acg gag 1913 Leu Pro Ala Leu Leu Glu Glu Leu Ala Asn Thr Val Val Gly Thr Glu 235 240 245 ctg cga cgc ttc gtt ctc gcc ctc gca gcc gcc gtc ggc ctc tcc agt 1961 Leu Arg Arg Phe Val Leu Ala Leu Ala Ala Ala Val Gly Leu Ser Ser 250 255 260 265 tcg cat gca gag gag ctt ctc cac cgg gcc gtc gct gtt cgt tcc agt 2009 Ser His Ala Glu Glu Leu Leu His Arg Ala Val Ala Val Arg Ser Ser 270 275 280 cgc ctc gaa tcg atc ttg ccg tcg gag aca ggc ttg ggg ttt ctg cac 2057 Arg Leu Glu Ser Ile Leu Pro Ser Glu Thr Gly Leu Gly Phe Leu His 285 290 295 cgc gac gcg gga gga gcg cgc gaa gag gaa ctc gga atc atc agc ttc 2105 Arg Asp Ala Gly Gly Ala Arg Glu Glu Glu Leu Gly Ile Ile Ser Phe 300 305 310 tgc tgc gtg acg aac gac agg caa ccg ttg cac atg cgg cac ttg gtg 2153 Cys Cys Val Thr Asn Asp Arg Gln Pro Leu His Met Arg His Leu Val 315 320 325 acg gtg aaa aac atc ttc tct cga cag ctc ccg aag atg ccg cga gag 2201 Thr Val Lys Asn Ile Phe Ser Arg Gln Leu Pro Lys Met Pro Arg Glu 330 335 340 345 tac atc gtc cgg ctc gtt ttt gac cga gcc cac ttc acc ttc tgt ctc 2249 Tyr Ile Val Arg Leu 
Val Phe Asp Arg Ala His Phe Thr Phe Cys Leu 350 355 360 tgc aag caa ggc cgc gtc atc gga gga gtc tgc ttc cgc ccc tac ttc 2297 Cys Lys Gln Gly Arg Val Ile Gly Gly Val Cys Phe Arg Pro Tyr Phe 365 370 375 cgc gaa aaa ttc gcg gaa att gct ttc ctc gcg gtg aca tct act gag 2345 Arg Glu Lys Phe Ala Glu Ile Ala Phe Leu Ala Val Thr Ser Thr Glu 380 385 390 cag gtc aag ggt tac ggg acg cgt ctc atg aat cat ctc aag gaa cat 2393 Gln Val Lys Gly Tyr Gly Thr Arg Leu Met Asn His Leu Lys Glu His 395 400 405 gtg aag aaa tct gga atc gaa tat ttc ctc acc tac gca gac aac ttt 2441 Val Lys Lys Ser Gly Ile Glu Tyr Phe Leu Thr Tyr Ala Asp Asn Phe 410 415 420 425 gca gtg ggg tat ttc cgt aag cag ggc ttc agc agc aag ata acg atg 2489 Ala Val Gly Tyr Phe Arg Lys Gln Gly Phe Ser Ser Lys Ile Thr Met 430 435 440 ccg cga gac cgt tgg ttg ggc tac atc aag gac tac gac ggc ggt acg 2537 Pro Arg Asp Arg Trp Leu Gly Tyr Ile Lys Asp Tyr Asp Gly Gly Thr 445 450 455 ttg atg gag tgt cgt ctc agc acc cga ata aat tac ctg aaa ctt tcg 2585 Leu Met Glu Cys Arg Leu Ser Thr Arg Ile Asn Tyr Leu Lys Leu Ser 460 465 470 cag ctc ctc gcc cta cag aaa ctc gca gtg aag cga cgc att gag caa 2633 Gln Leu Leu Ala Leu Gln Lys Leu Ala Val Lys Arg Arg Ile Glu Gln 475 480 485 tcc gcg cct tca gtc gtc tgt cct tct ctc tct ttc tgg aag gaa aat 2681 Ser Ala Pro Ser Val Val Cys Pro Ser Leu Ser Phe Trp Lys Glu Asn 490 495 500 505 cca ggt cag ctg ttg atg ccg tcg gcc att ccg ggc ttg gcc gaa cta 2729 Pro Gly Gln Leu Leu Met Pro Ser Ala Ile Pro Gly Leu Ala Glu Leu 510 515 520 aac aag aat ggc gag ctg tct ctg ctg ctg tct tcg ggg cgc gtc ggg 2777 Asn Lys Asn Gly Glu Leu Ser Leu Leu Leu Ser Ser Gly Arg Val Gly 525 530 535 gcc gcg ccc caa ggg tca ggg gcc ctt ccc ggc ggg cgc acg ggc gcc 2825 Ala Ala Pro Gln Gly Ser Gly Ala Leu Pro Gly Gly Arg Thr Gly Ala 540 545 550 ttg ggc tcc aaa aag ggc cct ttc ggg cgc gcg ggc ttc gcg aag ggc 2873 Leu Gly Ser Lys Lys Gly Pro Phe Gly Arg Ala Gly Phe Ala Lys Gly 555 560 565 gaa aag ggc ctg cgc gct gcg tca ctc aag gcg cag 
att gcg gcg ctt 2921 Glu Lys Gly Leu Arg Ala Ala Ser Leu Lys Ala Gln Ile Ala Ala Leu 570 575 580 585 ctg tca act ctg gaa aag cat tct tcc tct tgg ccc ttc cgg cga cct 2969 Leu Ser Thr Leu Glu Lys His Ser Ser Ser Trp Pro Phe Arg Arg Pro 590 595 600 gtc tcg gtc agc gag gcc ccc gac tac tac gag gtc gtg cga aga ccg 3017 Val Ser Val Ser Glu Ala Pro Asp Tyr Tyr Glu Val Val Arg Arg Pro 605 610 615 atc gac atc agc acc atg aaa aaa cga aat cga aat ggg gac tac aga 3065 Ile Asp Ile Ser Thr Met Lys Lys Arg Asn Arg Asn Gly Asp Tyr Arg 620 625 630 acg aag gag gcg ttc cag gaa gat ctg ctg ctg atg ttc gac aac tgt 3113 Thr Lys Glu Ala Phe Gln Glu Asp Leu Leu Leu Met Phe Asp Asn Cys 635 640 645 cgc gtg tac aat tcg ccc gac aca att tac tac aag tac gca gac gag 3161 Arg Val Tyr Asn Ser Pro Asp Thr Ile Tyr Tyr Lys Tyr Ala Asp Glu 650 655 660 665 ctc cag gcc ttc atc tgg ccc aag gtc gag gct ctc ggg agt ttt taa 3209 Leu Gln Ala Phe Ile Trp Pro Lys Val Glu Ala Leu Gly Ser Phe 670 675 680 ttaatcaccg ttgtgctcac ttctcaaatc gacaaaggaa acacacttcg tgcagcatgt 3269 gccccattat aaagaaactg agttgttccg ctgtggcttg caggtgtcac atccacaaaa 3329 accggccgac tctaaatagg agtgtttcgc agcaagcagc gaaagtttat gactgggtcc 3389 gaatctctga acggatgtgt ggcggacctg gctgatgttg atcgccgtcg acacacgcgc 3449 cacatgggtc aatacacaag acagctatca gttgttttag tcgaaccggt taacacaatt 3509 cttgcccccc cgagggggat ccactagttc tagagcggcc gctctagaac tagtggatcg 3569 atcccccggg ctgcaggaat tcatcctgca agtgcataga aggaaagttg tctgctgtcg 3629 tgggcagaca gcaacagtcc agcactctag cggcatacag aacgataacg cattcacgag 3689 tggatacacg cacatctgcg tcacccgcaa ctcgctttcg ttctgattga caaaaagaaa 3749 acaaggcgag gtgagactgt gtgaaatgcc acatgaagag tcatcccttt tcttcgataa 3809 aggacacagg ggtctctggc accccctcgt cagctctctc cgacccgagg cactctccct 3869 gatccctccg aaaagagagg aaaacgagag acgggcagct tctgtatttc cgctagacag 3929 ccatctccat ctggattcgt ccgtgcggga cgtagcccac gacctcaaaa tcctcggcgg 3989 tgaaatcgtc gatttccttg atgcgttcct tgttgaggat gttcacaatg gggaacggtc 4049 tcggttctct ccgcagctgc tcttttaaag 
cctcgacatg gttcgtgtag acatgcgtgt 4109 tccccatgaa gtgaatgaac tccttaggtt ttaggttgca gacgtgtgca accatgagcg 4169 tcaaaagcga ataggaagcg atgttgaagg ggacgccgag gccgacatcg cacgaccgct 4229 gatacatgat gcacgacagc tccttctggt cgttcacgta gaactggcac aacaagtgac 4289 aaggcggcag cgccatttcg tccagcgctg caggattcca ggcagtcatg agcatgcgac 4349 gatctgttgg attcgttctc agcatctgga tcacattctt cagctggtcg acgccctgcc 4409 ctgtgtagtc tgtgtgcatg tctttgtatg ccgcgccgaa gtgtctccac tggaagccgt 4469 agcccgggcc gatgtctccg acctctcggt gggggagatt gcgcgaatcg aggaactcgc 4529 gtgtcacatt cttgtcccag atcttcacgc ccttctcaga aagatggttt gcgttcgtgt 4589 cgccgcgaat gaaccacagc aactcttcga ggaccccttt ccagaacaca cgctttgtgg 4649 tgagaagtgg aaaggcctga tccagcgagt agcgcatagt gcagccgaat ttggagatga 4709 caccaacgcc cgttcggtca tccattgtcc ttccattgtt aataatgtcg gcaatgagat 4769 caaggtactg gaattcttca tggcctctaa agtgaacatg cggaacggcc cgaatcagtt 4829 ccttttgctc gcgttttttc cggtcttctt cgtccatcca cgccaacacc ggggcaatgg 4889 ctgcggccga agaaggagcc tgcaacccgt gcacgggagt tgtctccctc gtggacgtca 4949 aggagctcat tgcgttgctc ggttccgcag tggctgcgtc gtcagtcttc cttctcttct 5009 cgagaaccac aaagtcgtag ggtaccccgt tgtctgagaa ggtcttggaa atgaagatgg 5069 gtcgatacgt cgcttcattg tccttctctc ttccgagctc cggacaaaag ggaacgaaca 5129 cagactcggc aggagctgca gcctgcgcag cagttgattt gtttgaaaga atgtcatctc 5189 cggggaacgc agggaagaaa acgtcgcacg gaaactcgcg ggctacacgc gtgatgtaca 5249 ggtgagaggc aacgcccaga gacagcgctg cctcgtacag tcccgctcct cccacgacaa 5309 aaatctggtc gacagaatcc ttgtactctt cctccagaag gctgagagct gctgggagtg 5369 aagcacagac tcggacgcgc tgctggcctt cagcttgagg cttctccgcc gcaatgtctt 5429 cttctttgag ggaagaggaa acgacgatgt tcaatctgtc cacgaggggt ctaaactttc 5489 gaggcatgct ttcccaggtt ttccgtccca tgacaacggc gttgaatctc ttgccgactg 5549 atggagaggg aagtccagag tcgcccgtct ttgcaaattt cctgggaagc cacccgttca 5609 ggcgactggc ttcttcgggc gtcgtttttg tcacacgaga aaagtgtttg aaatctgtgg 5669 tcaagtgggg ccacgggagg ccgttgttga tgccgatgcc cctcttgggg gtcatcgcga 5729 cgaccagaca caccggtttc tgcatcttcc cagacacgac 
aacgccccgt agagcagaaa 5789 cgcactacta aagcgaaact tcacccgtcc ctgctgcact cagagcagtg ctccgcactg 5849 ccgtgtggta aaatgaaaag gttctacgag acacgcgtct ccggatcgac aagcgaagga 5909 tctgcacacc tggtctcgat gtcgaacaaa gcacggagga gagacggaaa gtgcttacat 5969 cgaacacggt tatcaaaccc gagaaaaaga aacgaacaga agaaaaagga aacctccgca 6029 tacttttaaa gaatgaagtt ccccgatttt cccaaaaatg gcgtcatttt cgcgcacggc 6089 agtcagataa caggtgtagc ggctgcccac caacagagac ggcgcggccg acaggacgct 6149 actgggactg cgaacagcag caagatcgga tcttccgcgg tggagctcca gcttttgttc 6209 cctttagtga gggttaattg cgcgcttggc gtaatcatgg tcatagctgt ttcctgtgtg 6269 aaattgttat ccgctcacaa ttccacacaa catacgagcc ggaagcataa agtgtaaagc 6329 ctggggtgcc taatgagtga gctaactcac attaattgcg ttgcgctcac tgcccgcttt 6389 ccagtcggga aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg cggggagagg 6449 cggtttgcgt attgggcgct cttccgcttc ctcgctcact gactcgctgc gctcggtcgt 6509 tcggctgcgg cgagcggtat cagctcactc aaaggcggta atacggttat ccacagaatc 6569 aggggataac gcaggaaaga acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa 6629 aaaggccgcg ttgctggcgt ttttccatag gctccgcccc cctgacgagc atcacaaaaa 6689 tcgacgctca agtcagaggt ggcgaaaccc gacaggacta taaagatacc aggcgtttcc 6749 ccctggaagc tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg gatacctgtc 6809 cgcctttctc ccttcgggaa gcgtggcgct ttctcatagc tcacgctgta ggtatctcag 6869 ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga 6929 ccgctgcgcc ttatccggta actatcgtct tgagtccaac ccggtaagac acgacttatc 6989 gccactggca gcagccactg gtaacaggat tagcagagcg aggtatgtag gcggtgctac 7049 agagttcttg aagtggtggc ctaactacgg ctacactaga aggacagtat ttggtatctg 7109 cgctctgctg aagccagtta ccttcggaaa aagagttggt agctcttgat ccggcaaaca 7169 aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa 7229 aggatctcaa gaagatcctt tgatcttttc tacggggtct gacgctcagt ggaacgaaaa 7289 ctcacgttaa gggattttgg tcatgagatt atcaaaaagg atcttcacct agatcctttt 7349 aaattaaaaa tgaagtttta aatcaatcta aagtatatat gagtaaactt ggtctgacag 7409 ttaccaatgc ttaatcagtg aggcacctat ctcagcgatc tgtctatttc 
gttcatccat 7469 agttgcctga ctccccgtcg tgtagataac tacgatacgg gagggcttac catctggccc 7529 cagtgctgca atgataccgc gagacccacg ctcaccggct ccagatttat cagcaataaa 7589 ccagccagcc ggaagggccg agcgcagaag tggtcctgca actttatccg cctccatcca 7649 gtctattaat tgttgccggg aagctagagt aagtagttcg ccagttaata gtttgcgcaa 7709 cgttgttgcc attgctacag gcatcgtggt gtcacgctcg tcgtttggta tggcttcatt 7769 cagctccggt tcccaacgat caaggcgagt tacatgatcc cccatgttgt gcaaaaaagc 7829 ggttagctcc ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag tgttatcact 7889 catggttatg gcagcactgc ataattctct tactgtcatg ccatccgtaa gatgcttttc 7949 tgtgactggt gagtactcaa ccaagtcatt ctgagaatag tgtatgcggc gaccgagttg 8009 ctcttgcccg gcgtcaatac gggataatac cgcgccacat agcagaactt taaaagtgct 8069 catcattgga aaacgttctt cggggcgaaa actctcaagg atcttaccgc tgttgagatc 8129 cagttcgatg taacccactc gtgcacccaa ctgatcttca gcatctttta ctttcaccag 8189 cgtttctggg tgagcaaaaa caggaaggca aaatgccgca aaaaagggaa taagggcgac 8249 acggaaatgt tgaatactca tactcttcct ttttcaatat tattgaagca tttatcaggg 8309 ttattgtctc atgagcggat acatatttga atgtatttag aaaaataaac aaataggggt 8369 tccgcgcaca tttccccgaa aagtgccac 8398 19 680 PRT Artificial Sequence Description of Artificial Sequence Ptub8TetR-GCN5-DHFRTS 19 Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu 1 5 10 15 Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln 20 25 30 Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys 35 40 45 Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His His 50 55 60 Thr His Phe Cys Pro Leu Glu Gly Glu Ser Trp Gln Asp Phe Leu Arg 65 70 75 80 Asn Asn Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asp Gly 85 90 95 Ala Lys Val His Leu Gly Thr Arg Pro Thr Glu Lys Gln Tyr Glu Thr 100 105 110 Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu Glu 115 120 125 Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly Cys 130 135 140 Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu Thr 145 150 155 160 Pro Thr Thr Asp Ser Met Pro Pro Leu Leu 
Arg Gln Ala Ile Glu Leu 165 170 175 Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu 180 185 190 Ile Ile Cys Gly Leu Glu Lys Gln Leu Lys Cys Glu Ser Gly Met His 195 200 205 Lys Gly Ala Pro Thr Gly Leu Gly Leu Ala Ser Phe Phe Gly Lys Ser 210 215 220 Phe Ile Phe His Thr Leu His Ala Ala Leu Pro Ala Leu Leu Glu Glu 225 230 235 240 Leu Ala Asn Thr Val Val Gly Thr Glu Leu Arg Arg Phe Val Leu Ala 245 250 255 Leu Ala Ala Ala Val Gly Leu Ser Ser Ser His Ala Glu Glu Leu Leu 260 265 270 His Arg Ala Val Ala Val Arg Ser Ser Arg Leu Glu Ser Ile Leu Pro 275 280 285 Ser Glu Thr Gly Leu Gly Phe Leu His Arg Asp Ala Gly Gly Ala Arg 290 295 300 Glu Glu Glu Leu Gly Ile Ile Ser Phe Cys Cys Val Thr Asn Asp Arg 305 310 315 320 Gln Pro Leu His Met Arg His Leu Val Thr Val Lys Asn Ile Phe Ser 325 330 335 Arg Gln Leu Pro Lys Met Pro Arg Glu Tyr Ile Val Arg Leu Val Phe 340 345 350 Asp Arg Ala His Phe Thr Phe Cys Leu Cys Lys Gln Gly Arg Val Ile 355 360 365 Gly Gly Val Cys Phe Arg Pro Tyr Phe Arg Glu Lys Phe Ala Glu Ile 370 375 380 Ala Phe Leu Ala Val Thr Ser Thr Glu Gln Val Lys Gly Tyr Gly Thr 385 390 395 400 Arg Leu Met Asn His Leu Lys Glu His Val Lys Lys Ser Gly Ile Glu 405 410 415 Tyr Phe Leu Thr Tyr Ala Asp Asn Phe Ala Val Gly Tyr Phe Arg Lys 420 425 430 Gln Gly Phe Ser Ser Lys Ile Thr Met Pro Arg Asp Arg Trp Leu Gly 435 440 445 Tyr Ile Lys Asp Tyr Asp Gly Gly Thr Leu Met Glu Cys Arg Leu Ser 450 455 460 Thr Arg Ile Asn Tyr Leu Lys Leu Ser Gln Leu Leu Ala Leu Gln Lys 465 470 475 480 Leu Ala Val Lys Arg Arg Ile Glu Gln Ser Ala Pro Ser Val Val Cys 485 490 495 Pro Ser Leu Ser Phe Trp Lys Glu Asn Pro Gly Gln Leu Leu Met Pro 500 505 510 Ser Ala Ile Pro Gly Leu Ala Glu Leu Asn Lys Asn Gly Glu Leu Ser 515 520 525 Leu Leu Leu Ser Ser Gly Arg Val Gly Ala Ala Pro Gln Gly Ser Gly 530 535 540 Ala Leu Pro Gly Gly Arg Thr Gly Ala Leu Gly Ser Lys Lys Gly Pro 545 550 555 560 Phe Gly Arg Ala Gly Phe Ala Lys Gly Glu Lys Gly Leu Arg Ala Ala 565 570 575 Ser Leu Lys Ala Gln Ile Ala Ala Leu Leu Ser 
Thr Leu Glu Lys His 580 585 590 Ser Ser Ser Trp Pro Phe Arg Arg Pro Val Ser Val Ser Glu Ala Pro 595 600 605 Asp Tyr Tyr Glu Val Val Arg Arg Pro Ile Asp Ile Ser Thr Met Lys 610 615 620 Lys Arg Asn Arg Asn Gly Asp Tyr Arg Thr Lys Glu Ala Phe Gln Glu 625 630 635 640 Asp Leu Leu Leu Met Phe Asp Asn Cys Arg Val Tyr Asn Ser Pro Asp 645 650 655 Thr Ile Tyr Tyr Lys Tyr Ala Asp Glu Leu Gln Ala Phe Ile Trp Pro 660 665 670 Lys Val Glu Ala Leu Gly Ser Phe 675 680 20 6216 DNA Artificial Sequence CDS (1167)..(1844) Description of Artificial Sequence pTub8TATi-1-HXGPRT 20 ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga 60 ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 120 ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta gggttccgat 180 ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt tcacgtagtg 240 ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaata 300 gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat tcttttgatt 360 tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat 420 ttaacgcgaa ttttaacaaa atattaacgc ttacaatttc cattcgccat tcaggctgcg 480 caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540 gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg 600 taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat tgggtaccgg 660 gccccccctc gacggtatcg ataagcttaa ccacaaacct tgagacgcgt gttccaacca 720 cgcaccctga cacgcgtgtt ccaaccacgc accctgagac gcgtgttcta accacgcacc 780 ctgagacgcg tgttctaacc acgcaccctg agacgcgtgt tcaagcttgc ctgcattggg 840 tgcggttggt gatcctggtt ggaccggtgg agatgcgcgc gcacgaaggg gatgtgtcag 900 aaacattttg tttgttctct gtgaactttt agatgtgtta aaggcggcga atattagcag 960 agagtcctcc ttgttccatt ctctcttgaa tttcgccctt tccttctctt tgcgagtgtg 1020 gtagagaaca agcactcgtt cgccgtccct gacgacgcaa cccgcgcaga agacatccac 1080 caaacggtgt tacacaatca ccttgtgtga agttcttgcg gaaaactact cgttggcatt 1140 ttttcttgaa ttcctttttc gacaaa atg tcg cgc ctg gac aag agc aaa gtc 1193 Met Ser Arg Leu Asp Lys Ser Lys 
Val 1 5 atc aac tct gct ctg gaa tta ctc aat gaa gtc ggt atc gaa ggc ctg 1241 Ile Asn Ser Ala Leu Glu Leu Leu Asn Glu Val Gly Ile Glu Gly Leu 10 15 20 25 acg aca agg aaa ctc gct caa aag ctg gga gtt gag cag cct acc ctg 1289 Thr Thr Arg Lys Leu Ala Gln Lys Leu Gly Val Glu Gln Pro Thr Leu 30 35 40 tac tgg cac gtg aag aac aag cgg gcc ctg ctc gat gcc ctg gca atc 1337 Tyr Trp His Val Lys Asn Lys Arg Ala Leu Leu Asp Ala Leu Ala Ile 45 50 55 gag atg ctg gac agg cat cat acc cac ttc tgc ccc ctg gaa ggc gag 1385 Glu Met Leu Asp Arg His His Thr His Phe Cys Pro Leu Glu Gly Glu 60 65 70 tca tgg caa gac ttt ctg cgg aac aac gcc aag tca ttc cgc tgt gct 1433 Ser Trp Gln Asp Phe Leu Arg Asn Asn Ala Lys Ser Phe Arg Cys Ala 75 80 85 ctc ctc tca cat cgc gac ggg gct aaa gtg cat ctc ggc acc cgc cca 1481 Leu Leu Ser His Arg Asp Gly Ala Lys Val His Leu Gly Thr Arg Pro 90 95 100 105 aca gag aaa cag tac gaa acc ctg gaa aat cag ctc gcg ttc ctg tgt 1529 Thr Glu Lys Gln Tyr Glu Thr Leu Glu Asn Gln Leu Ala Phe Leu Cys 110 115 120 cag caa ggc ttc tcc ctg gag aac gca ctg tac gct ctg tcc gcc gtg 1577 Gln Gln Gly Phe Ser Leu Glu Asn Ala Leu Tyr Ala Leu Ser Ala Val 125 130 135 ggc cac ttt aca ctg ggc tgc gta ttg gag gat cag gag cat caa gta 1625 Gly His Phe Thr Leu Gly Cys Val Leu Glu Asp Gln Glu His Gln Val 140 145 150 gca aaa gag gaa aga gag aca cct acc acc gat tct atg ccc cca ctt 1673 Ala Lys Glu Glu Arg Glu Thr Pro Thr Thr Asp Ser Met Pro Pro Leu 155 160 165 ctg aga caa gca att gag ctg ttc gac cat cag gga gcc gaa cct gcc 1721 Leu Arg Gln Ala Ile Glu Leu Phe Asp His Gln Gly Ala Glu Pro Ala 170 175 180 185 ttc ctt ttc ggc ctg gaa cta atc ata tgt ggc ctg gag aaa ccc acg 1769 Phe Leu Phe Gly Leu Glu Leu Ile Ile Cys Gly Leu Glu Lys Pro Thr 190 195 200 ttc ttt aat agt gga ctc ttg ttc caa act gga aca aca ctc aac cct 1817 Phe Phe Asn Ser Gly Leu Leu Phe Gln Thr Gly Thr Thr Leu Asn Pro 205 210 215 atc tcg gtc tat tct ttt gat tta taa gggattttgc cgatttcggc 1864 Ile Ser Val Tyr Ser Phe Asp Leu 220 225 
ctattggtta aaaaatgagt tgatttacca aaaatttaac gcgaatttta acaaaatatt 1924 aacgcttaca atttaggtgg cccttttcgg ggaaatgtgc gcggaacccc tatttgttta 1984 tttttttaaa tacattcaaa tatgtatccg ctcatgagcc aataaccctg ataaatgctt 2044 caataatatc aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaagga tccactagtt 2104 ctagagcggc cgccaccgcg gtgtcactgt agcctgccag aacacttgtc aaccgactgt 2164 gtccacattt ttatgcgcac tgactggcat gaatggccag aggcaggcat cagcaagtca 2224 cgtaggccaa cgcgtgcgca gaaacgctca aggctcgatt gtgggtgggg gttggtagca 2284 ttttatcgac ctaaacaagg tttacactta ggtggtgcgg ttttactgat ctggacggat 2344 tcagcggtcg cagattatcg atctgcaaat ggtgtacact taggtgtcgc ggcttattta 2404 gttaagggag cttcgtggtc ggagcctaac aagtcaacag agacgtatcg ccaatcgttc 2464 gcggtgaaga gtcgaaactg acagcacatc gtagggaaac tgagagggtg ctcctttctc 2524 tccgtcgttt gcgctgcacc atcctgcaag tgcatagaag gaaagttgtc tgctgtcgtg 2584 ggcagacagc aacagtccag cactctagcg gcatacagaa cgataacgca ttcacgagtg 2644 gatacacgca catctgcgtc acccgcaact cgctttcgtt ctgattgaca aaaagaaaac 2704 aaggcgaggt gagactgtgt gaaatgccac atgaagagtc atcccttttc ttcgataaag 2764 gacacagggg tctctggcac cccctcgtca gctctctccg acccgaggca ctctccctga 2824 tccctccgaa aagagaggaa aacgagagac gggcagcttc tgtagggcta tgcagggttt 2884 acttctcgaa ctttttgcga gcggcgtcgc tcaggacggc gacgtggtcg aagtcgcgga 2944 acatctcgtt gaagtcgtag cagcaaccaa cgatccagac gtcttcaatg ctgaagccga 3004 cgaagtcgcc cttcaagctg ttggagcgat ctgtgcgctt ctcgacgagg gtggcgattc 3064 tcatcgactt gggaccgacg gctttcaggc gctcaccgaa ctcggtgagg gtgaaaccgg 3124 tgtcgacgat gtcctcaaca atcagaacgt gcttgtcgcg aaagattgac aagtcgtcgc 3184 tcaagacggt gagctggcct gtgctgttgt cgttctggta ggacttcagg cggacatagt 3244 gctcgaagaa ggggggcacg ctggactcac gaccactgta cttctgtatg gtggcaaggt 3304 agtcgatcag aaggttgaag aagccgcgag agcctttcag gatgcaaatg atgtgcaact 3364 cctcgccgaa gtaagttctg tggatgtcat acgccaactt ctcaactctg tccttgacca 3424 atccaccagg gaggaggatt ttgtcaatgt agggcttgca gtgggggggc acaagaaagt 3484 catcagcgtt gtagaaggtg ttgtcgggga tatacatggg ctcaatacgg cccttgccct 3544 tgccgtagtc 
ttcaatgggt ttggacgcca ttttggatct gacaacgccc cgtagagcag 3604 aaacgcacta ctaaagcgaa acttcacccg tccctgctgc actcagagca gtgctccgca 3664 ctgccgtgtg gtaaaatgaa aaggttctac gagacacgcg tctccggatc gacaagcgaa 3724 ggatctgcac acctggtctc gatgtcgaac aaagcacgga ggagagacgg aaagtgctta 3784 catcgaacac ggttatcaaa cccgagaaaa agaaacgaac agaagaaaaa ggaaacctcc 3844 gcatactttt aaagaatgaa gttccccgat tttcccaaaa atggcgtcat tttcgcgcac 3904 ggcagtcaga taacaggtgt agcggctgcc caccaacaga gacggcgcgg ccgacaggac 3964 gctactggga ctgcgaacag cagcaagatc ggatcttccg cggtggagct ccagcttttg 4024 ttccctttag tgagggttaa ttgcgcgctt ggcgtaatca tggtcatagc tgtttcctgt 4084 gtgaaattgt tatccgctca caattccaca caacatacga gccggaagca taaagtgtaa 4144 agcctggggt gcctaatgag tgagctaact cacattaatt gcgttgcgct cactgcccgc 4204 tttccagtcg ggaaacctgt cgtgccagct gcattaatga atcggccaac gcgcggggag 4264 aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt 4324 cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga 4384 atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 4444 taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa 4504 aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt 4564 tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct 4624 gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct 4684 cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 4744 cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt 4804 atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc 4864 tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag tatttggtat 4924 ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 4984 acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 5044 aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga 5104 aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct 5164 tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga 5224 cagttaccaa tgcttaatca 
gtgaggcacc tatctcagcg atctgtctat ttcgttcatc 5284 catagttgcc tgactccccg tcgtgtagat aactacgata cgggagggct taccatctgg 5344 ccccagtgct gcaatgatac cgcgagaccc acgctcaccg gctccagatt tatcagcaat 5404 aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaactttat ccgcctccat 5464 ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta atagtttgcg 5524 caacgttgtt gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg gtatggcttc 5584 attcagctcc ggttcccaac gatcaaggcg agttacatga tcccccatgt tgtgcaaaaa 5644 agcggttagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg cagtgttatc 5704 actcatggtt atggcagcac tgcataattc tcttactgtc atgccatccg taagatgctt 5764 ttctgtgact ggtgagtact caaccaagtc attctgagaa tagtgtatgc ggcgaccgag 5824 ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca catagcagaa ctttaaaagt 5884 gctcatcatt ggaaaacgtt cttcggggcg aaaactctca aggatcttac cgctgttgag 5944 atccagttcg atgtaaccca ctcgtgcacc caactgatct tcagcatctt ttactttcac 6004 cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg gaataagggc 6064 gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa gcatttatca 6124 gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg 6184 ggttccgcgc acatttcccc gaaaagtgcc ac 6216 21 225 PRT Artificial Sequence Description of Artificial Sequence pTub8TATi-1-HXGPRT 21 Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu 1 5 10 15 Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln 20 25 30 Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys 35 40 45 Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His His 50 55 60 Thr His Phe Cys Pro Leu Glu Gly Glu Ser Trp Gln Asp Phe Leu Arg 65 70 75 80 Asn Asn Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asp Gly 85 90 95 Ala Lys Val His Leu Gly Thr Arg Pro Thr Glu Lys Gln Tyr Glu Thr 100 105 110 Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu Glu 115 120 125 Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly Cys 130 135 140 Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu Thr 145 150 155 160 Pro 
Thr Thr Asp Ser Met Pro Pro Leu Leu Arg Gln Ala Ile Glu Leu 165 170 175 Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu 180 185 190 Ile Ile Cys Gly Leu Glu Lys Pro Thr Phe Phe Asn Ser Gly Leu Leu 195 200 205 Phe Gln Thr Gly Thr Thr Leu Asn Pro Ile Ser Val Tyr Ser Phe Asp 210 215 220 Leu 225 22 6392 DNA Artificial Sequence CDS (1167)..(1823) Description of Artificial Sequence pTub8TATi-3-HXGPRT 22 ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga 60 ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct tcctttctcg 120 ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta gggttccgat 180 ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt tcacgtagtg 240 ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg ttctttaata 300 gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat tcttttgatt 360 tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat 420 ttaacgcgaa ttttaacaaa atattaacgc ttacaatttc cattcgccat tcaggctgcg 480 caactgttgg gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg 540 gggatgtgct gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg 600 taaaacgacg gccagtgagc gcgcgtaata cgactcacta tagggcgaat tgggtaccgg 660 gccccccctc gacggtatcg ataagcttaa ccacaaacct tgagacgcgt gttccaacca 720 cgcaccctga cacgcgtgtt ccaaccacgc accctgagac gcgtgttcta accacgcacc 780 ctgagacgcg tgttctaacc acgcaccctg agacgcgtgt tcaagcttgc ctgcattggg 840 tgcggttggt gatcctggtt ggaccggtgg agatgcgcgc gcacgaaggg gatgtgtcag 900 aaacattttg tttgttctct gtgaactttt agatgtgtta aaggcggcga atattagcag 960 agagtcctcc ttgttccatt ctctcttgaa tttcgccctt tccttctctt tgcgagtgtg 1020 gtagagaaca agcactcgtt cgccgtccct gacgacgcaa cccgcgcaga agacatccac 1080 caaacggtgt tacacaatca ccttgtgtga agttcttgcg gaaaactact cgttggcatt 1140 ttttcttgaa ttcctttttc gacaaa atg tcg cgc ctg gac aag agc aaa gtc 1193 Met Ser Arg Leu Asp Lys Ser Lys Val 1 5 atc aac tct gct ctg gaa tta ctc aat gaa gtc ggt atc gaa ggc ctg 1241 Ile Asn Ser Ala Leu Glu Leu Leu Asn Glu Val Gly Ile Glu Gly Leu 10 15 20 
25 acg aca agg aaa ctc gct caa aag ctg gga gtt gag cag cct acc ctg 1289 Thr Thr Arg Lys Leu Ala Gln Lys Leu Gly Val Glu Gln Pro Thr Leu 30 35 40 tac tgg cac gtg aag aac aag cgg gcc ctg ctc gat gcc ctg gca atc 1337 Tyr Trp His Val Lys Asn Lys Arg Ala Leu Leu Asp Ala Leu Ala Ile 45 50 55 gag atg ctg gac agg cat cat acc cac ttc tgc ccc ctg gaa ggc gag 1385 Glu Met Leu Asp Arg His His Thr His Phe Cys Pro Leu Glu Gly Glu 60 65 70 tca tgg caa gac ttt ctg cgg aac aac gcc aag tca ttc cgc tgt gct 1433 Ser Trp Gln Asp Phe Leu Arg Asn Asn Ala Lys Ser Phe Arg Cys Ala 75 80 85 ctc ctc tca cat cgc gac ggg gct aaa gtg cat ctc ggc acc cgc cca 1481 Leu Leu Ser His Arg Asp Gly Ala Lys Val His Leu Gly Thr Arg Pro 90 95 100 105 aca gag aaa cag tac gaa acc ctg gaa aat cag ctc gcg ttc ctg tgt 1529 Thr Glu Lys Gln Tyr Glu Thr Leu Glu Asn Gln Leu Ala Phe Leu Cys 110 115 120 cag caa ggc ttc tcc ctg gag aac gca ctg tac gct ctg tcc gcc gtg 1577 Gln Gln Gly Phe Ser Leu Glu Asn Ala Leu Tyr Ala Leu Ser Ala Val 125 130 135 ggc cac ttt aca ctg ggc tgc gta ttg gag gat cag gag cat caa gta 1625 Gly His Phe Thr Leu Gly Cys Val Leu Glu Asp Gln Glu His Gln Val 140 145 150 gca aaa gag gaa aga gag aca cct acc acc gat tct atg ccc cca ctt 1673 Ala Lys Glu Glu Arg Glu Thr Pro Thr Thr Asp Ser Met Pro Pro Leu 155 160 165 ctg aga caa gca att gag ctg ttc gac cat cag gga gcc gaa cct gcc 1721 Leu Arg Gln Ala Ile Glu Leu Phe Asp His Gln Gly Ala Glu Pro Ala 170 175 180 185 ttc ctt ttc ggc ctg gaa cta atc ata tgt ggc ctg gag aaa cag ctg 1769 Phe Leu Phe Gly Leu Glu Leu Ile Ile Cys Gly Leu Glu Lys Gln Leu 190 195 200 act ctt gtt cca aac tgg aac aac act caa ccc tat ctc ggt cta ttc 1817 Thr Leu Val Pro Asn Trp Asn Asn Thr Gln Pro Tyr Leu Gly Leu Phe 205 210 215 ttt tga tttataaggg attttgccga tttcggccta ttggttaaaa aatgagctga 1873 Phe tttaacaaaa atttaacgcg aattttaaca aaatattaac gcttacaatt taggtggcac 1933 ttttcgggga aatgtgcgcg gaacccctat ttgtttattt ttctaaatac attcaaatat 1993 gtatccgctc atgagacaat aaccctgata aatgcttcaa 
taatattgaa aaaggaagag 2053 tatgagtatt caacatttcc gtgtcgccct tattcccttt tttgcggcat tttgccttcc 2113 tgtttttgct cacccagaaa cgctggtgaa agtaaaagat gctgaagatc agttgggtgc 2173 accgagtggg ttacatcgaa ctggatctca acagcggtaa gatccttgag agttttccca 2233 cccaaaaaca cccccaaaca caaaaaaaaa caaaggatcc actagttcta gagcggccgc 2293 caccgcggtg tcactgtagc ctgccagaac acttgtcaac cgactgtgtc cacattttta 2353 tgcgcactga ctggcatgaa tggccagagg caggcatcag caagtcacgt aggccaacgc 2413 gtgcgcagaa acgctcaagg ctcgattgtg ggtgggggtt ggtagcattt tatcgaccta 2473 aacaaggttt acacttaggt ggtgcggttt tactgatctg gacggattca gcggtcgcag 2533 attatcgatc tgcaaatggt gtacacttag gtgtcgcggc ttatttagtt aagggagctt 2593 cgtggtcgga gcctaacaag tcaacagaga cgtatcgcca atcgttcgcg gtgaagagtc 2653 gaaactgaca gcacatcgta gggaaactga gagggtgctc ctttctctcc gtcgtttgcg 2713 ctgcaccatc ctgcaagtgc atagaaggaa agttgtctgc tgtcgtgggc agacagcaac 2773 agtccagcac tctagcggca tacagaacga taacgcattc acgagtggat acacgcacat 2833 ctgcgtcacc cgcaactcgc tttcgttctg attgacaaaa agaaaacaag gcgaggtgag 2893 actgtgtgaa atgccacatg aagagtcatc ccttttcttc gataaaggac acaggggtct 2953 ctggcacccc ctcgtcagct ctctccgacc cgaggcactc tccctgatcc ctccgaaaag 3013 agaggaaaac gagagacggg cagcttctgt agggctatgc agggtttact tctcgaactt 3073 tttgcgagcg gcgtcgctca ggacggcgac gtggtcgaag tcgcggaaca tctcgttgaa 3133 gtcgtagcag caaccaacga tccagacgtc ttcaatgctg aagccgacga agtcgccctt 3193 caagctgttg gagcgatctg tgcgcttctc gacgagggtg gcgattctca tcgacttggg 3253 accgacggct ttcaggcgct caccgaactc ggtgagggtg aaaccggtgt cgacgatgtc 3313 ctcaacaatc agaacgtgct tgtcgcgaaa gattgacaag tcgtcgctca agacggtgag 3373 ctggcctgtg ctgttgtcgt tctggtagga cttcaggcgg acatagtgct cgaagaaggg 3433 gggcacgctg gactcacgac cactgtactt ctgtatggtg gcaaggtagt cgatcagaag 3493 gttgaagaag ccgcgagagc ctttcaggat gcaaatgatg tgcaactcct cgccgaagta 3553 agttctgtgg atgtcatacg ccaacttctc aactctgtcc ttgaccaatc caccagggag 3613 gaggattttg tcaatgtagg gcttgcagtg ggggggcaca agaaagtcat cagcgttgta 3673 gaaggtgttg tcggggatat acatgggctc aatacggccc ttgcccttgc 
cgtagtcttc 3733 aatgggtttg gacgccattt tggatctgac aacgccccgt agagcagaaa cgcactacta 3793 aagcgaaact tcacccgtcc ctgctgcact cagagcagtg ctccgcactg ccgtgtggta 3853 aaatgaaaag gttctacgag acacgcgtct ccggatcgac aagcgaagga tctgcacacc 3913 tggtctcgat gtcgaacaaa gcacggagga gagacggaaa gtgcttacat cgaacacggt 3973 tatcaaaccc gagaaaaaga aacgaacaga agaaaaagga aacctccgca tacttttaaa 4033 gaatgaagtt ccccgatttt cccaaaaatg gcgtcatttt cgcgcacggc agtcagataa 4093 caggtgtagc ggctgcccac caacagagac ggcgcggccg acaggacgct actgggactg 4153 cgaacagcag caagatcgga tcttccgcgg tggagctcca gcttttgttc cctttagtga 4213 gggttaattg cgcgcttggc gtaatcatgg tcatagctgt ttcctgtgtg aaattgttat 4273 ccgctcacaa ttccacacaa catacgagcc ggaagcataa agtgtaaagc ctggggtgcc 4333 taatgagtga gctaactcac attaattgcg ttgcgctcac tgcccgcttt ccagtcggga 4393 aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg cggggagagg cggtttgcgt 4453 attgggcgct cttccgcttc ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg 4513 cgagcggtat cagctcactc aaaggcggta atacggttat ccacagaatc aggggataac 4573 gcaggaaaga acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg 4633 ttgctggcgt ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca 4693 agtcagaggt ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc 4753 tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc 4813 ccttcgggaa gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag 4873 gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc 4933 ttatccggta actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca 4993 gcagccactg gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg 5053 aagtggtggc ctaactacgg ctacactaga aggacagtat ttggtatctg cgctctgctg 5113 aagccagtta ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct 5173 ggtagcggtg gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa 5233 gaagatcctt tgatcttttc tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa 5293 gggattttgg tcatgagatt atcaaaaagg atcttcacct agatcctttt aaattaaaaa 5353 tgaagtttta aatcaatcta aagtatatat gagtaaactt ggtctgacag ttaccaatgc 
5413 ttaatcagtg aggcacctat ctcagcgatc tgtctatttc gttcatccat agttgcctga 5473 ctccccgtcg tgtagataac tacgatacgg gagggcttac catctggccc cagtgctgca 5533 atgataccgc gagacccacg ctcaccggct ccagatttat cagcaataaa ccagccagcc 5593 ggaagggccg agcgcagaag tggtcctgca actttatccg cctccatcca gtctattaat 5653 tgttgccggg aagctagagt aagtagttcg ccagttaata gtttgcgcaa cgttgttgcc 5713 attgctacag gcatcgtggt gtcacgctcg tcgtttggta tggcttcatt cagctccggt 5773 tcccaacgat caaggcgagt tacatgatcc cccatgttgt gcaaaaaagc ggttagctcc 5833 ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag tgttatcact catggttatg 5893 gcagcactgc ataattctct tactgtcatg ccatccgtaa gatgcttttc tgtgactggt 5953 gagtactcaa ccaagtcatt ctgagaatag tgtatgcggc gaccgagttg ctcttgcccg 6013 gcgtcaatac gggataatac cgcgccacat agcagaactt taaaagtgct catcattgga 6073 aaacgttctt cggggcgaaa actctcaagg atcttaccgc tgttgagatc cagttcgatg 6133 taacccactc gtgcacccaa ctgatcttca gcatctttta ctttcaccag cgtttctggg 6193 tgagcaaaaa caggaaggca aaatgccgca aaaaagggaa taagggcgac acggaaatgt 6253 tgaatactca tactcttcct ttttcaatat tattgaagca tttatcaggg ttattgtctc 6313 atgagcggat acatatttga atgtatttag aaaaataaac aaataggggt tccgcgcaca 6373 tttccccgaa aagtgccac 6392 23 218 PRT Artificial Sequence Description of Artificial Sequence pTub8TATi-3-HXGPRT 23 Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu 1 5 10 15 Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln 20 25 30 Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys 35 40 45 Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His His 50 55 60 Thr His Phe Cys Pro Leu Glu Gly Glu Ser Trp Gln Asp Phe Leu Arg 65 70 75 80 Asn Asn Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asp Gly 85 90 95 Ala Lys Val His Leu Gly Thr Arg Pro Thr Glu Lys Gln Tyr Glu Thr 100 105 110 Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu Glu 115 120 125 Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly Cys 130 135 140 Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu Thr 145 150 
155 160 Pro Thr Thr Asp Ser Met Pro Pro Leu Leu Arg Gln Ala Ile Glu Leu 165 170 175 Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu 180 185 190 Ile Ile Cys Gly Leu Glu Lys Gln Leu Thr Leu Val Pro Asn Trp Asn 195 200 205 Asn Thr Gln Pro Tyr Leu Gly Leu Phe Phe 210 215 24 77 DNA Artificial Sequence Description of Artificial Sequence TATi-1 nucleotide sequence of activating domain 24 cccacgttct ttaatagtgg actcttgttc caaactggaa caacactcaa ccctatctcg 60 gtctattctt ttgattt 77 25 26 PRT Artificial Sequence Description of Artificial Sequence TATi-1 presumed amino acid sequence of activating domain 25 Pro Thr Phe Phe Asn Ser Gly Gly Leu Leu Phe Gln Thr Thr Thr Leu 1 5 10 15 Asn Pro Ile Ser Val Tyr Ser Phe Asp Leu 20 25 26 224 PRT Artificial Sequence Description of Artificial Sequence TATi-1 26 Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu 1 5 10 15 Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln 20 25 30 Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys 35 40 45 Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His His 50 55 60 Thr His Phe Cys Pro Leu Glu Gly Glu Ser Trp Gln Asp Phe Leu Arg 65 70 75 80 Asn Asn Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asp Gly 85 90 95 Ala Lys Val His Leu Gly Thr Arg Pro Thr Glu Lys Gln Tyr Glu Thr 100 105 110 Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu Glu 115 120 125 Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly Cys 130 135 140 Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu Thr 145 150 155 160 Pro Thr Thr Asp Ser Met Pro Pro Leu Leu Arg Gln Ala Ile Glu Leu 165 170 175 Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu 180 185 190 Ile Ile Cys Gly Leu Glu Lys Pro Thr Phe Phe Asn Ser Gly Leu Leu 195 200 205 Phe Gln Thr Gly Thr Thr Leu Asn Pro Ile Ser Val Tyr Ser Phe Asp 210 215 220 27 57 DNA Artificial Sequence Description of Artificial Sequence TATi-3 nucleotide sequence of activating domain 27 cagctgactc 
ttgttccaaa ctggaacaac actcaaccct atctcggtct attcttt 57 28 19 PRT Artificial Sequence Description of Artificial Sequence TATi-3 presumed amino acid sequence of activating domain 28 Gln Leu Thr Leu Val Pro Asn Trp Asn Asn Thr Gln Pro Tyr Leu Gly 1 5 10 15 Leu Phe Phe 29 218 PRT Artificial Sequence Description of Artificial Sequence TATi-3 29 Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu 1 5 10 15 Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln 20 25 30 Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys 35 40 45 Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His His 50 55 60 Thr His Phe Cys Pro Leu Glu Gly Glu Ser Trp Gln Asp Phe Leu Arg 65 70 75 80 Asn Asn Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asp Gly 85 90 95 Ala Lys Val His Leu Gly Thr Arg Pro Thr Glu Lys Gln Tyr Glu Thr 100 105 110 Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu Glu 115 120 125 Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly Cys 130 135 140 Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu Thr 145 150 155 160 Pro Thr Thr Asp Ser Met Pro Pro Leu Leu Arg Gln Ala Ile Glu Leu 165 170 175 Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu 180 185 190 Ile Ile Cys Gly Leu Glu Lys Gln Leu Thr Leu Val Pro Asn Trp Asn 195 200 205 Asn Thr Gln Pro Tyr Leu Gly Leu Phe Phe 210 215 30 58 DNA Artificial Sequence Description of Artificial Sequence Oligonucleotide 30 cggaattcct tttcgacaaa atgtcgcgcc tggacaagag caaagtcatc aactctgc 58 31 37 DNA Artificial Sequence Description of Artificial Sequence Oligonucleotide 31 cccttaatta atgcataccg ctttcgcact tcagctg 37 
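The relationship between a listed nucleotide sequence and its presumed amino acid sequence can be checked by a simple in-frame translation. The following is a minimal sketch, assuming the standard genetic code and reading frame 1. It translates the TATi-3 activating domain nucleotide sequence (SEQ ID NO: 27) and compares the result with the presumed amino acid sequence listed as SEQ ID NO: 28. The names `translate`, `to_three_letter`, `TATI3_DNA` and `TATI3_LISTED` are illustrative and are not part of the original disclosure.

```python
# Standard genetic code, built from the conventional t, c, a, g codon ordering.
BASES = "tcag"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue codes, as used in the sequence listing.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; whitespace is ignored, trailing partial codons are dropped."""
    dna = "".join(dna.split()).lower()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def to_three_letter(protein: str) -> str:
    """Convert a one-letter protein string to space-separated three-letter codes."""
    return " ".join(ONE_TO_THREE[residue] for residue in protein)


# SEQ ID NO: 27 as printed in the listing (57 nucleotides).
TATI3_DNA = "cagctgactc ttgttccaaa ctggaacaac actcaaccct atctcggtct attcttt"
# SEQ ID NO: 28 as printed in the listing (19 residues).
TATI3_LISTED = (
    "Gln Leu Thr Leu Val Pro Asn Trp Asn Asn "
    "Thr Gln Pro Tyr Leu Gly Leu Phe Phe"
)

if __name__ == "__main__":
    translated = to_three_letter(translate(TATI3_DNA))
    print(translated)
    print("matches listing:", translated == TATI3_LISTED)
```

The same helper could be applied to the other coding sequences in the listing, provided the reading frame is taken from the stated CDS coordinates.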

What is claimed is:
 1. A nucleic acid construct comprising a tetracycline repressor operatively linked to a transacting factor of T. gondii.
 2. A nucleic acid construct as claimed in claim 1, wherein the transacting factor of T. gondii comprises a nucleic acid sequence selected from a group consisting of TATi-1 activating domain, TATi-3 activating domain, and a sequence complementary or homologous thereto.
 3. A transcriptional activator of T. gondii comprising an amino acid sequence of TATi-1 or TATi-3, or an analog, homolog, ortholog, related polypeptide, derivative, fragment or isoform thereof.
 4. A transacting factor of T. gondii comprising an amino acid sequence of TATi-1 or TATi-3 activating domain, or an analog, homolog, ortholog, related polypeptide, derivative, fragment or isoform thereof.
 5. A vector comprising a nucleic acid construct as defined in claim 1.
 6. An expression vector comprising a nucleic acid construct as defined in claim 1.
 7. An Apicomplexan tetracycline-inducible transactivator system, comprising a tetracycline repressor and a transacting factor of T. gondii.
 8. A tetracycline-inducible transactivator system, comprising a tetracycline repressor and a transacting factor of T. gondii for use in Apicomplexan species.
 9. A host cell transformed with a nucleic acid construct as defined in claim 1, or a vector as claimed in claim 4.
 10. A host cell as claimed in claim 9, which is an Apicomplexan host cell.
 11. A host cell as claimed in claim 10, in which the Apicomplexan cell is selected from the group consisting of Toxoplasma gondii, Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii, Plasmodium knowlesi, Trypanosoma brucei, Entamoeba histolytica, and Giardia lamblia.
 12. A nucleic acid construct as defined in claim 1 for use in medicine.
 13. A host cell as defined in claim 9 for use in medicine.
 14. A method of treatment for or prevention of an infection caused by a protozoan, selected from the group consisting of Toxoplasma gondii, Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii, Plasmodium knowlesi, Trypanosoma brucei, Entamoeba histolytica and Giardia lamblia, comprising administration to a subject of a nucleic acid construct as defined in claim 1.
 15. A method of treatment for or prevention of an infection caused by a protozoan, selected from the group consisting of Toxoplasma gondii, Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii, Plasmodium knowlesi, Trypanosoma brucei, Entamoeba histolytica and Giardia lamblia, comprising administration to a subject of a host cell as defined in claim 9.
 16. A method of treatment as claimed in claim 14, in which the protozoan is Toxoplasma gondii.
 17. A method of treatment as claimed in claim 14, in which the protozoan is a Plasmodium species.
 18. A vaccine composition comprising a protozoan selected from the group consisting of Toxoplasma gondii, Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii, Plasmodium knowlesi, Trypanosoma brucei, Entamoeba histolytica and Giardia lamblia transfected with a nucleic acid construct as defined in claim 1.
 19. The use of a nucleic acid construct as defined in claim 1 in the preparation of a vaccine for use in the treatment or prophylaxis of an infection caused by a protozoan selected from the group consisting of Toxoplasma gondii, Plasmodium falciparum, Plasmodium vivax, Plasmodium berghei, Plasmodium yoelii, Plasmodium knowlesi, Trypanosoma brucei, Entamoeba histolytica and Giardia lamblia.
 20. A kit of parts comprising a host cell as defined in claim 9 and an administration vehicle selected from tablets for oral administration, inhalers for lung administration, and injectable solutions for intravenous administration. 